Re: [petsc-users] PetscSF Fortran interface

2023-01-10 Thread Nicholas Arnold-Medabalimi
Hi Junchao

 Apologies for not seeing that. Usually the Fortran90-specific functions are
noted on the original C function's manual page, and I also couldn't
see PetscSFCreateSectionSFF90 in the function list on the doc site. Thanks
so much, and I saw your notes on the merge request.

I don't suppose PetscSFReduceBegin and End are likewise hidden somewhere.
I'm moving between distributions, and I can go forward with
PetscSFBcastBegin, but I also need to go backward with Reduce.

I feel like this is a one-to-one change from Bcast to Reduce: I've added the
relevant lines in src/vec/is/sf/interface/ftn-custom/zsf.c and
src/vec/f90-mod/petscvec.h90, and it compiles fine, but I'm still getting a
linking error for the Reduce routines.
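
For concreteness, the kind of wrapper I mean is sketched below. This is
schematic only, not the attached file: the argument list, the MPI f2c
conversions, and the name-mangling defines are my guesses, modeled on the
existing Bcast stubs in zsf.c.

#include <petsc/private/fortranimpl.h>
#include <petscsf.h>

#if defined(PETSC_HAVE_FORTRAN_CAPS)
  #define petscsfreducebegin_ PETSCSFREDUCEBEGIN
  #define petscsfreduceend_   PETSCSFREDUCEEND
#elif !defined(PETSC_HAVE_FORTRAN_UNDERSCORE)
  #define petscsfreducebegin_ petscsfreducebegin
  #define petscsfreduceend_   petscsfreduceend
#endif

/* Fortran passes everything by reference; the MPI handles arrive as MPI_Fint. */
PETSC_EXTERN void petscsfreducebegin_(PetscSF *sf, MPI_Fint *unit, const void *leafdata,
                                      void *rootdata, MPI_Fint *op, int *ierr)
{
  *ierr = PetscSFReduceBegin(*sf, MPI_Type_f2c(*unit), leafdata, rootdata, MPI_Op_f2c(*op));
}

PETSC_EXTERN void petscsfreduceend_(PetscSF *sf, MPI_Fint *unit, const void *leafdata,
                                    void *rootdata, MPI_Fint *op, int *ierr)
{
  *ierr = PetscSFReduceEnd(*sf, MPI_Type_f2c(*unit), leafdata, rootdata, MPI_Op_f2c(*op));
}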

I need some input on what I'm missing here. I hope I didn't miss that this
routine exists elsewhere.

I've attached the two files, but it's not an ideal way to transmit changes.

If I get some instructions on contributing, I can make a merge request for
the changes if they are helpful.


Thanks

Nicholas

On Tue, Jan 10, 2023 at 4:42 PM Junchao Zhang 
wrote:

> Hi, Nicholas,
>It seems we have implemented it, but with another name,
> PetscSFCreateSectionSFF90, see
> https://gitlab.com/petsc/petsc/-/merge_requests/5386
>Try it to see if it works!
>
> --Junchao Zhang
>
>
> On Tue, Jan 10, 2023 at 11:45 AM Nicholas Arnold-Medabalimi <
> narno...@umich.edu> wrote:
>
>> Er to be honest I still can't get my stub to compile properly, and I
>> don't know how to go about making a merge request. But here is what I am
>> attempting right now. Let me know how best to proceed
>>
>>
>> It's not exactly clear to me how to set up the remote offset properly.
>>
>> in src/vec/is/sf/interface/ftn-custom/zsf.c
>>
>> PETSC_EXTERN void petscsfcreatesectionsf(PetscSF *sf, PetscSection
>> *rootSection, F90Array1d *aremoteOffsets, PetscSection *leafSection,
>> PetscSF *sectionSF, int * ierr PETSC_F90_2PTR_PROTO(remoteoffsetsd))
>> {
>>
>>   int * remoteOffsets;
>>   *ierr = F90Array1dAccess(aremoteOffsets, PETSC_INT, (void**) &remoteOffsets PETSC_F90_2PTR_PARAM(remoteoffsetsd));if (*ierr) return;
>>   *ierr = PetscSFCreateSectionSF(*sf,*rootSection, remoteOffsets,*leafSection,*sectionSF);if (*ierr) return;
>>
>> }
>>
>> This is the sticking point.
>>
>> Sincerely
>> Nicholas
>>
>>
>> On Tue, Jan 10, 2023 at 12:38 PM Junchao Zhang 
>> wrote:
>>
>>> Hi, Nicholas,
>>>   Could you make a merge request to PETSc and then our Fortran experts
>>> can comment on your MR?
>>>   Thanks.
>>>
>>> --Junchao Zhang
>>>
>>>
>>> On Tue, Jan 10, 2023 at 11:10 AM Nicholas Arnold-Medabalimi <
>>> narno...@umich.edu> wrote:
>>>
 Hi Junchao

 I think I'm almost there, but I could use some insight into how to use
 the PETSC_F90_2PTR_PROTO and F90Array1dAccess for the remoteOffset
 parameter input so if another function comes up, I can add it myself
 without wasting your time.
 I am very grateful for your help and time.

 Sincerely
 Nicholas

 On Tue, Jan 10, 2023 at 10:55 AM Junchao Zhang 
 wrote:

> Hi, Nicholas,
>I am not a fortran guy, but I will try to add
> petscsfcreatesectionsf.
>
>Thanks.
> --Junchao Zhang
>
>
> On Tue, Jan 10, 2023 at 12:50 AM Nicholas Arnold-Medabalimi <
> narno...@umich.edu> wrote:
>
>> I think it should be something like this, but I'm not very fluent in
>> Fortran C interop syntax. Any advice would be appreciated. Thanks
>>
>> PETSC_EXTERN void petscsfcreatesectionsf(PetscSF *sf, PetscSection *
>> rootSection, F90Array1d *aremoteOffsets, PetscSection *leafSection,
>> PetscSF *sectionSF, int * ierr PETSC_F90_2PTR_PROTO(remoteoffsetsd))
>> {
>>
>>   int * remoteOffsets;
>>   *ierr = F90Array1dAccess(aremoteOffsets, PETSC_INT, (void**) &remoteOffsets PETSC_F90_2PTR_PARAM(remoteoffsetsd));if (*ierr) return;
>>   *ierr = PetscSFCreateSectionSF(*sf,*rootSection, remoteOffsets,*leafSection,*sectionSF);if (*ierr) return;
>>
>> }
>>
>> On Mon, Jan 9, 2023 at 11:41 PM Nicholas Arnold-Medabalimi <
>> narno...@umich.edu> wrote:
>>
>>> Hi Junchao
>>>
>>> Thanks again for your help in November. I've been using your
>>> merge request branch quite heavily. Would it be possible to add a
>>> petscsfcreatesectionsf interface as well?
>>> I'm trying to write it myself using your commits as a guide but I
>>> have been struggling with handling the section parameter properly.
>>>
>>> Sincerely
>>> Nicholas
>>>
>>> On Sat, Nov 19, 2022 at 9:44 PM Junchao Zhang <
>>> junchao.zh...@gmail.com> wrote:
>>>



 On Sat, Nov 19, 2022 at 8:05 PM Nicholas Arnold-Medabalimi <
 narno...@umich.edu> wrote:

> Hi
>
> Thanks, this is awesome. Thanks for the very prompt fix. Just one
> question: will the array outputs on the fortran side copies 

Re: [petsc-users] [EXTERNAL] GPU implementation of serial smoothers

2023-01-10 Thread Mark Lohry
Thanks Stefano and Chonglin!

> DILU in OpenFOAM is our block Jacobi + ILU subdomain solver
>

Are you saying that -pc_type gamg -mg_levels_pc_type -mg_levels_ksp_type
richardson gives you something exactly equivalent to DILU?

On Tue, Jan 10, 2023 at 4:04 PM Zhang, Chonglin  wrote:

> I am using the following in my Poisson solver running on GPU, which were
> suggested by Barry and Mark (Dr. Mark Adams).
>   -ksp_type cg
>   -pc_type gamg
>   -mg_levels_ksp_type chebyshev
>   -mg_levels_pc_type jacobi
>
>
> On Jan 10, 2023, at 3:31 PM, Mark Lohry  wrote:
>
> So what are people using for GAMG configs on GPU? I was hoping petsc today
> would be performance competitive with AMGx but it sounds like that's not
> the case?
>
> On Tue, Jan 10, 2023 at 3:03 PM Jed Brown  wrote:
>
>> Mark Lohry  writes:
>>
>> > I definitely need multigrid. I was under the impression that GAMG was
>> > relatively cuda-complete, is that not the case? What functionality works
>> > fully on GPU and what doesn't, without any host transfers (aside from
>> > what's needed for MPI)?
>> >
>> > If I use -ksp-pc_type gamg -mg_levels_pc_type pbjacobi
>> -mg_levels_ksp_type
>> > richardson is that fully on device, but -mg_levels_pc_type ilu or
>> > -mg_levels_pc_type sor require transfers?
>>
>> You can do `-mg_levels_pc_type ilu`, but it'll be extremely slow (like
>> 20x slower than an operator apply). One can use Krylov smoothers, though
>> that's more synchronization. Automatic construction of operator-dependent
>> multistage smoothers for linear multigrid (because Chebyshev only works for
>> problems that have eigenvalues near the real axis) is something I've wanted
>> to develop for at least a decade, but time is always short. I might put
>> some effort into p-MG with such smoothers this year as we add DDES to our
>> scale-resolving compressible solver.
>>
>
>


Re: [petsc-users] PetscSF Fortran interface

2023-01-10 Thread Junchao Zhang
Hi, Nicholas,
   It seems we have implemented it, but with another name,
PetscSFCreateSectionSFF90, see
https://gitlab.com/petsc/petsc/-/merge_requests/5386
   Try it to see if it works!

--Junchao Zhang


On Tue, Jan 10, 2023 at 11:45 AM Nicholas Arnold-Medabalimi <
narno...@umich.edu> wrote:

> Er to be honest I still can't get my stub to compile properly, and I don't
> know how to go about making a merge request. But here is what I am
> attempting right now. Let me know how best to proceed
>
>
> It's not exactly clear to me how to set up the remote offset properly.
>
> in src/vec/is/sf/interface/ftn-custom/zsf.c
>
> PETSC_EXTERN void petscsfcreatesectionsf(PetscSF *sf, PetscSection
> *rootSection, F90Array1d *aremoteOffsets, PetscSection *leafSection,
> PetscSF *sectionSF, int * ierr PETSC_F90_2PTR_PROTO(remoteoffsetsd))
> {
>
>   int * remoteOffsets;
>   *ierr = F90Array1dAccess(aremoteOffsets, PETSC_INT, (void**) &remoteOffsets PETSC_F90_2PTR_PARAM(remoteoffsetsd));if (*ierr) return;
>   *ierr = PetscSFCreateSectionSF(*sf,*rootSection, remoteOffsets,*leafSection,*sectionSF);if (*ierr) return;
>
> }
>
> This is the sticking point.
>
> Sincerely
> Nicholas
>
>
> On Tue, Jan 10, 2023 at 12:38 PM Junchao Zhang 
> wrote:
>
>> Hi, Nicholas,
>>   Could you make a merge request to PETSc and then our Fortran experts
>> can comment on your MR?
>>   Thanks.
>>
>> --Junchao Zhang
>>
>>
>> On Tue, Jan 10, 2023 at 11:10 AM Nicholas Arnold-Medabalimi <
>> narno...@umich.edu> wrote:
>>
>>> Hi Junchao
>>>
>>> I think I'm almost there, but I could use some insight into how to use
>>> the PETSC_F90_2PTR_PROTO and F90Array1dAccess for the remoteOffset
>>> parameter input so if another function comes up, I can add it myself
>>> without wasting your time.
>>> I am very grateful for your help and time.
>>>
>>> Sincerely
>>> Nicholas
>>>
>>> On Tue, Jan 10, 2023 at 10:55 AM Junchao Zhang 
>>> wrote:
>>>
 Hi, Nicholas,
I am not a fortran guy, but I will try to add petscsfcreatesectionsf.

Thanks.
 --Junchao Zhang


 On Tue, Jan 10, 2023 at 12:50 AM Nicholas Arnold-Medabalimi <
 narno...@umich.edu> wrote:

> I think it should be something like this, but I'm not very fluent in
> Fortran C interop syntax. Any advice would be appreciated. Thanks
>
> PETSC_EXTERN void petscsfcreatesectionsf(PetscSF *sf, PetscSection *
> rootSection, F90Array1d *aremoteOffsets, PetscSection *leafSection,
> PetscSF *sectionSF, int * ierr PETSC_F90_2PTR_PROTO(remoteoffsetsd))
> {
>
>   int * remoteOffsets;
>   *ierr = F90Array1dAccess(aremoteOffsets, PETSC_INT, (void**) &remoteOffsets PETSC_F90_2PTR_PARAM(remoteoffsetsd));if (*ierr) return;
>   *ierr = PetscSFCreateSectionSF(*sf,*rootSection, remoteOffsets,*leafSection,*sectionSF);if (*ierr) return;
>
> }
>
> On Mon, Jan 9, 2023 at 11:41 PM Nicholas Arnold-Medabalimi <
> narno...@umich.edu> wrote:
>
>> Hi Junchao
>>
>> Thanks again for your help in November. I've been using your
>> merge request branch quite heavily. Would it be possible to add a
>> petscsfcreatesectionsf interface as well?
>> I'm trying to write it myself using your commits as a guide but I
>> have been struggling with handling the section parameter properly.
>>
>> Sincerely
>> Nicholas
>>
>> On Sat, Nov 19, 2022 at 9:44 PM Junchao Zhang <
>> junchao.zh...@gmail.com> wrote:
>>
>>>
>>>
>>>
>>> On Sat, Nov 19, 2022 at 8:05 PM Nicholas Arnold-Medabalimi <
>>> narno...@umich.edu> wrote:
>>>
 Hi

 Thanks, this is awesome. Thanks for the very prompt fix. Just one
 question: will the array outputs on the fortran side copies (and need 
 to be
 deallocated) or direct access to the dmplex?

>>> Direct access to internal data;  no need to deallocate
>>>
>>>

 Sincerely
 Nicholas

 On Sat, Nov 19, 2022 at 8:21 PM Junchao Zhang <
 junchao.zh...@gmail.com> wrote:

> Hi, Nicholas,
>   See this MR,
> https://gitlab.com/petsc/petsc/-/merge_requests/5860
>   It is in testing, but you can try branch
> jczhang/add-petscsf-fortran to see if it works for you.
>
>   Thanks.
> --Junchao Zhang
>
> On Sat, Nov 19, 2022 at 4:16 PM Nicholas Arnold-Medabalimi <
> narno...@umich.edu> wrote:
>
>> Hi Junchao
>>
>> Thanks. I was wondering if there is any update on this. I may
>> write a small interface for those two routines myself in the interim 
>> but
>> I'd appreciate any insight you have.
>>
>> Sincerely
>> Nicholas
>>
>> On Wed, Nov 16, 2022 at 10:39 PM Junchao Zhang <
>> junchao.zh...@gmail.com> wrote:
>>
>>> Hi, Nicholas,
>>>   I will have a look 

Re: [petsc-users] [EXTERNAL] GPU implementation of serial smoothers

2023-01-10 Thread Zhang, Chonglin
I am using the following in my Poisson solver running on GPU, which were 
suggested by Barry and Mark (Dr. Mark Adams).
  -ksp_type cg
  -pc_type gamg
  -mg_levels_ksp_type chebyshev
  -mg_levels_pc_type jacobi
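
For reference, a minimal sketch of the solver side that picks these options up at
runtime; A, b, x are assumed to be an already-assembled Mat and Vecs (e.g.
MATAIJCUSPARSE / VECCUDA so the solve stays on the GPU), and only the KSP calls
matter here:

#include <petscksp.h>

static PetscErrorCode SolvePoisson(Mat A, Vec b, Vec x)
{
  KSP ksp;

  PetscFunctionBeginUser;
  PetscCall(KSPCreate(PETSC_COMM_WORLD, &ksp));
  PetscCall(KSPSetOperators(ksp, A, A));
  PetscCall(KSPSetFromOptions(ksp)); /* picks up -ksp_type cg -pc_type gamg -mg_levels_... */
  PetscCall(KSPSolve(ksp, b, x));
  PetscCall(KSPDestroy(&ksp));
  PetscFunctionReturn(0);
}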

On Jan 10, 2023, at 3:31 PM, Mark Lohry  wrote:

So what are people using for GAMG configs on GPU? I was hoping petsc today 
would be performance competitive with AMGx but it sounds like that's not the 
case?

On Tue, Jan 10, 2023 at 3:03 PM Jed Brown <j...@jedbrown.org> wrote:
Mark Lohry <mlo...@gmail.com> writes:

> I definitely need multigrid. I was under the impression that GAMG was
> relatively cuda-complete, is that not the case? What functionality works
> fully on GPU and what doesn't, without any host transfers (aside from
> what's needed for MPI)?
>
> If I use -ksp-pc_type gamg -mg_levels_pc_type pbjacobi -mg_levels_ksp_type
> richardson is that fully on device, but -mg_levels_pc_type ilu or
> -mg_levels_pc_type sor require transfers?

You can do `-mg_levels_pc_type ilu`, but it'll be extremely slow (like 20x 
slower than an operator apply). One can use Krylov smoothers, though that's 
more synchronization. Automatic construction of operator-dependent multistage 
smoothers for linear multigrid (because Chebyshev only works for problems that 
have eigenvalues near the real axis) is something I've wanted to develop for at 
least a decade, but time is always short. I might put some effort into p-MG 
with such smoothers this year as we add DDES to our scale-resolving 
compressible solver.



Re: [petsc-users] GPU implementation of serial smoothers

2023-01-10 Thread Stefano Zampini
DILU in OpenFOAM is our block Jacobi + ILU subdomain solver
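
In PETSc option form that is roughly the sketch below (ksp is assumed to already
exist, and the options would normally just go on the command line; whether this is
exactly DILU, which only modifies the diagonal, is a separate question):

  /* "block Jacobi with an ILU subdomain solver": one block per MPI rank by default */
  PetscCall(PetscOptionsSetValue(NULL, "-pc_type", "bjacobi"));
  PetscCall(PetscOptionsSetValue(NULL, "-sub_pc_type", "ilu")); /* ILU(0) on each block */
  PetscCall(KSPSetFromOptions(ksp));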

On Tue, Jan 10, 2023, 23:45 Barry Smith  wrote:

>
>   The default is some kind of Jacobi plus Chebyshev; for a certain class
> of problems, it is quite good.
>
>
>
> On Jan 10, 2023, at 3:31 PM, Mark Lohry  wrote:
>
> So what are people using for GAMG configs on GPU? I was hoping petsc today
> would be performance competitive with AMGx but it sounds like that's not
> the case?
>
> On Tue, Jan 10, 2023 at 3:03 PM Jed Brown  wrote:
>
>> Mark Lohry  writes:
>>
>> > I definitely need multigrid. I was under the impression that GAMG was
>> > relatively cuda-complete, is that not the case? What functionality works
>> > fully on GPU and what doesn't, without any host transfers (aside from
>> > what's needed for MPI)?
>> >
>> > If I use -ksp-pc_type gamg -mg_levels_pc_type pbjacobi
>> -mg_levels_ksp_type
>> > richardson is that fully on device, but -mg_levels_pc_type ilu or
>> > -mg_levels_pc_type sor require transfers?
>>
>> You can do `-mg_levels_pc_type ilu`, but it'll be extremely slow (like
>> 20x slower than an operator apply). One can use Krylov smoothers, though
>> that's more synchronization. Automatic construction of operator-dependent
>> multistage smoothers for linear multigrid (because Chebyshev only works for
>> problems that have eigenvalues near the real axis) is something I've wanted
>> to develop for at least a decade, but time is always short. I might put
>> some effort into p-MG with such smoothers this year as we add DDES to our
>> scale-resolving compressible solver.
>>
>
>


Re: [petsc-users] GPU implementation of serial smoothers

2023-01-10 Thread Barry Smith

  The default is some kind of Jacobi plus Chebyshev; for a certain class of
problems, it is quite good.



> On Jan 10, 2023, at 3:31 PM, Mark Lohry  wrote:
> 
> So what are people using for GAMG configs on GPU? I was hoping petsc today 
> would be performance competitive with AMGx but it sounds like that's not the 
> case?
> 
> On Tue, Jan 10, 2023 at 3:03 PM Jed Brown wrote:
>> Mark Lohry <mlo...@gmail.com> writes:
>> 
>> > I definitely need multigrid. I was under the impression that GAMG was
>> > relatively cuda-complete, is that not the case? What functionality works
>> > fully on GPU and what doesn't, without any host transfers (aside from
>> > what's needed for MPI)?
>> >
>> > If I use -ksp-pc_type gamg -mg_levels_pc_type pbjacobi -mg_levels_ksp_type
>> > richardson is that fully on device, but -mg_levels_pc_type ilu or
>> > -mg_levels_pc_type sor require transfers?
>> 
>> You can do `-mg_levels_pc_type ilu`, but it'll be extremely slow (like 20x 
>> slower than an operator apply). One can use Krylov smoothers, though that's 
>> more synchronization. Automatic construction of operator-dependent 
>> multistage smoothers for linear multigrid (because Chebyshev only works for 
>> problems that have eigenvalues near the real axis) is something I've wanted 
>> to develop for at least a decade, but time is always short. I might put some 
>> effort into p-MG with such smoothers this year as we add DDES to our 
>> scale-resolving compressible solver.



Re: [petsc-users] GPU implementation of serial smoothers

2023-01-10 Thread Mark Lohry
So what are people using for GAMG configs on GPU? I was hoping petsc today
would be performance competitive with AMGx but it sounds like that's not
the case?

On Tue, Jan 10, 2023 at 3:03 PM Jed Brown  wrote:

> Mark Lohry  writes:
>
> > I definitely need multigrid. I was under the impression that GAMG was
> > relatively cuda-complete, is that not the case? What functionality works
> > fully on GPU and what doesn't, without any host transfers (aside from
> > what's needed for MPI)?
> >
> > If I use -ksp-pc_type gamg -mg_levels_pc_type pbjacobi
> -mg_levels_ksp_type
> > richardson is that fully on device, but -mg_levels_pc_type ilu or
> > -mg_levels_pc_type sor require transfers?
>
> You can do `-mg_levels_pc_type ilu`, but it'll be extremely slow (like 20x
> slower than an operator apply). One can use Krylov smoothers, though that's
> more synchronization. Automatic construction of operator-dependent
> multistage smoothers for linear multigrid (because Chebyshev only works for
> problems that have eigenvalues near the real axis) is something I've wanted
> to develop for at least a decade, but time is always short. I might put
> some effort into p-MG with such smoothers this year as we add DDES to our
> scale-resolving compressible solver.
>


Re: [petsc-users] GPU implementation of serial smoothers

2023-01-10 Thread Jed Brown
Mark Lohry  writes:

> I definitely need multigrid. I was under the impression that GAMG was
> relatively cuda-complete, is that not the case? What functionality works
> fully on GPU and what doesn't, without any host transfers (aside from
> what's needed for MPI)?
>
> If I use -ksp-pc_type gamg -mg_levels_pc_type pbjacobi -mg_levels_ksp_type
> richardson is that fully on device, but -mg_levels_pc_type ilu or
> -mg_levels_pc_type sor require transfers?

You can do `-mg_levels_pc_type ilu`, but it'll be extremely slow (like 20x 
slower than an operator apply). One can use Krylov smoothers, though that's 
more synchronization. Automatic construction of operator-dependent multistage 
smoothers for linear multigrid (because Chebyshev only works for problems that 
have eigenvalues near the real axis) is something I've wanted to develop for at 
least a decade, but time is always short. I might put some effort into p-MG 
with such smoothers this year as we add DDES to our scale-resolving 
compressible solver.


Re: [petsc-users] GPU implementation of serial smoothers

2023-01-10 Thread Mark Lohry
>
> BTW, on unstructured grids, coloring requires a lot of colors and thus
> many times more bandwidth (due to multiple passes) than the operator itself.


I've noticed -- in AMGx the multicolor GS was generally dramatically slower
than jacobi because of lots of colors with few elements.

You can use sparse triangular kernels like ILU (provided by cuBLAS), but
> they are so mindbogglingly slow that you'll go back to the drawing board
> and try to use a multigrid method of some sort with polynomial/point-block
> smoothing.
>

I definitely need multigrid. I was under the impression that GAMG was
relatively cuda-complete, is that not the case? What functionality works
fully on GPU and what doesn't, without any host transfers (aside from
what's needed for MPI)?

If I use -ksp-pc_type gamg -mg_levels_pc_type pbjacobi -mg_levels_ksp_type
richardson is that fully on device, but -mg_levels_pc_type ilu or
-mg_levels_pc_type sor require transfers?


On Tue, Jan 10, 2023 at 2:47 PM Jed Brown  wrote:

> The joy of GPUs. You can use sparse triangular kernels like ILU (provided
> by cuBLAS), but they are so mindbogglingly slow that you'll go back to the
> drawing board and try to use a multigrid method of some sort with
> polynomial/point-block smoothing.
>
> BTW, on unstructured grids, coloring requires a lot of colors and thus
> many times more bandwidth (due to multiple passes) than the operator itself.
>
> Mark Lohry  writes:
>
> > Well that's suboptimal. What are my options for 100% GPU solves with no
> > host transfers?
> >
> > On Tue, Jan 10, 2023, 2:23 PM Barry Smith  wrote:
> >
> >>
> >>
> >> On Jan 10, 2023, at 2:19 PM, Mark Lohry  wrote:
> >>
> >> Is DILU a point-block method? We have -pc_type pbjacobi (and vpbjacobi
> if
> >>> the node size is not uniform). They are good choices for
> scale-resolving CFD
> >>> on GPUs.
> >>>
> >>
> >> I was hoping you'd know :)  pbjacobi is underperforming ilu by a pretty
> >> wide margin on some of the systems i'm looking at.
> >>
> >> We don't have colored smoothers currently in PETSc.
> >>>
> >>
> >> So what happens under the hood when I run -mg_levels_pc_type sor on GPU?
> >> Are you actually decomposing the matrix into lower and computing updates
> >> with matrix multiplications? Or is it just the standard serial algorithm
> >> with thread safety ignored?
> >>
> >>
> >>   It is running the regular SOR on the CPU and needs to copy up the
> vector
> >> and copy down the result.
> >>
> >>
> >> On Tue, Jan 10, 2023 at 1:52 PM Barry Smith  wrote:
> >>
> >>>
> >>>   We don't have colored smoothers currently in PETSc.
> >>>
> >>> > On Jan 10, 2023, at 12:56 PM, Jed Brown  wrote:
> >>> >
> >>> > Is DILU a point-block method? We have -pc_type pbjacobi (and
> vpbjacobi
> >>> if the node size is not uniform). They are good choices for
> scale-resolving
> >>> CFD on GPUs.
> >>> >
> >>> > Mark Lohry  writes:
> >>> >
> >>> >> I'm running GAMG with CUDA, and I'm wondering how the nominally
> serial
> >>> >> smoother algorithms are implemented on GPU? Specifically SOR/GS and
> >>> ILU(0)
> >>> >> -- in e.g. AMGx these are applied by first creating a coloring, and
> the
> >>> >> smoother passes are done color by color. Is this how it's done in
> >>> petsc AMG?
> >>> >>
> >>> >> Tangential, AMGx and OpenFOAM offer something called "DILU",
> diagonal
> >>> ILU.
> >>> >> Is there an equivalent in petsc?
> >>> >>
> >>> >> Thanks,
> >>> >> Mark
> >>>
> >>>
> >>
>


Re: [petsc-users] GPU implementation of serial smoothers

2023-01-10 Thread Jed Brown
The joy of GPUs. You can use sparse triangular kernels like ILU (provided by 
cuBLAS), but they are so mindbogglingly slow that you'll go back to the drawing 
board and try to use a multigrid method of some sort with 
polynomial/point-block smoothing.

BTW, on unstructured grids, coloring requires a lot of colors and thus many 
times more bandwidth (due to multiple passes) than the operator itself.

Mark Lohry  writes:

> Well that's suboptimal. What are my options for 100% GPU solves with no
> host transfers?
>
> On Tue, Jan 10, 2023, 2:23 PM Barry Smith  wrote:
>
>>
>>
>> On Jan 10, 2023, at 2:19 PM, Mark Lohry  wrote:
>>
>> Is DILU a point-block method? We have -pc_type pbjacobi (and vpbjacobi if
>>> the node size is not uniform). They are good choices for scale-resolving CFD
>>> on GPUs.
>>>
>>
>> I was hoping you'd know :)  pbjacobi is underperforming ilu by a pretty
>> wide margin on some of the systems i'm looking at.
>>
>> We don't have colored smoothers currently in PETSc.
>>>
>>
>> So what happens under the hood when I run -mg_levels_pc_type sor on GPU?
>> Are you actually decomposing the matrix into lower and computing updates
>> with matrix multiplications? Or is it just the standard serial algorithm
>> with thread safety ignored?
>>
>>
>>   It is running the regular SOR on the CPU and needs to copy up the vector
>> and copy down the result.
>>
>>
>> On Tue, Jan 10, 2023 at 1:52 PM Barry Smith  wrote:
>>
>>>
>>>   We don't have colored smoothers currently in PETSc.
>>>
>>> > On Jan 10, 2023, at 12:56 PM, Jed Brown  wrote:
>>> >
>>> > Is DILU a point-block method? We have -pc_type pbjacobi (and vpbjacobi
>>> if the node size is not uniform). They are good choices for scale-resolving
>>> CFD on GPUs.
>>> >
>>> > Mark Lohry  writes:
>>> >
>>> >> I'm running GAMG with CUDA, and I'm wondering how the nominally serial
>>> >> smoother algorithms are implemented on GPU? Specifically SOR/GS and
>>> ILU(0)
>>> >> -- in e.g. AMGx these are applied by first creating a coloring, and the
>>> >> smoother passes are done color by color. Is this how it's done in
>>> petsc AMG?
>>> >>
>>> >> Tangential, AMGx and OpenFOAM offer something called "DILU", diagonal
>>> ILU.
>>> >> Is there an equivalent in petsc?
>>> >>
>>> >> Thanks,
>>> >> Mark
>>>
>>>
>>


Re: [petsc-users] GPU implementation of serial smoothers

2023-01-10 Thread Mark Lohry
Well that's suboptimal. What are my options for 100% GPU solves with no
host transfers?

On Tue, Jan 10, 2023, 2:23 PM Barry Smith  wrote:

>
>
> On Jan 10, 2023, at 2:19 PM, Mark Lohry  wrote:
>
> Is DILU a point-block method? We have -pc_type pbjacobi (and vpbjacobi if
>> the node size is not uniform). They are good choices for scale-resolving CFD
>> on GPUs.
>>
>
> I was hoping you'd know :)  pbjacobi is underperforming ilu by a pretty
> wide margin on some of the systems i'm looking at.
>
> We don't have colored smoothers currently in PETSc.
>>
>
> So what happens under the hood when I run -mg_levels_pc_type sor on GPU?
> Are you actually decomposing the matrix into lower and computing updates
> with matrix multiplications? Or is it just the standard serial algorithm
> with thread safety ignored?
>
>
>   It is running the regular SOR on the CPU and needs to copy up the vector
> and copy down the result.
>
>
> On Tue, Jan 10, 2023 at 1:52 PM Barry Smith  wrote:
>
>>
>>   We don't have colored smoothers currently in PETSc.
>>
>> > On Jan 10, 2023, at 12:56 PM, Jed Brown  wrote:
>> >
>> > Is DILU a point-block method? We have -pc_type pbjacobi (and vpbjacobi
>> if the node size is not uniform). They are good choices for scale-resolving
>> CFD on GPUs.
>> >
>> > Mark Lohry  writes:
>> >
>> >> I'm running GAMG with CUDA, and I'm wondering how the nominally serial
>> >> smoother algorithms are implemented on GPU? Specifically SOR/GS and
>> ILU(0)
>> >> -- in e.g. AMGx these are applied by first creating a coloring, and the
>> >> smoother passes are done color by color. Is this how it's done in
>> petsc AMG?
>> >>
>> >> Tangential, AMGx and OpenFOAM offer something called "DILU", diagonal
>> ILU.
>> >> Is there an equivalent in petsc?
>> >>
>> >> Thanks,
>> >> Mark
>>
>>
>


Re: [petsc-users] GPU implementation of serial smoothers

2023-01-10 Thread Barry Smith


> On Jan 10, 2023, at 2:19 PM, Mark Lohry  wrote:
> 
>> Is DILU a point-block method? We have -pc_type pbjacobi (and vpbjacobi if 
>> the node size is not uniform). They are good choices for scale-resolving CFD
>> on GPUs.
> 
> I was hoping you'd know :)  pbjacobi is underperforming ilu by a pretty wide 
> margin on some of the systems i'm looking at.
> 
>> We don't have colored smoothers currently in PETSc.
> 
> So what happens under the hood when I run -mg_levels_pc_type sor on GPU? Are 
> you actually decomposing the matrix into lower and computing updates with 
> matrix multiplications? Or is it just the standard serial algorithm with 
> thread safety ignored?

  It is running the regular SOR on the CPU and needs to copy up the vector and 
copy down the result.
> 
> On Tue, Jan 10, 2023 at 1:52 PM Barry Smith wrote:
>> 
>>   We don't have colored smoothers currently in PETSc.
>> 
>> > On Jan 10, 2023, at 12:56 PM, Jed Brown wrote:
>> > 
>> > Is DILU a point-block method? We have -pc_type pbjacobi (and vpbjacobi if 
>> > the node size is not uniform). They are good choices for scale-resolving
>> > CFD on GPUs.
>> > 
>> > Mark Lohry <mlo...@gmail.com> writes:
>> > 
>> >> I'm running GAMG with CUDA, and I'm wondering how the nominally serial
>> >> smoother algorithms are implemented on GPU? Specifically SOR/GS and ILU(0)
>> >> -- in e.g. AMGx these are applied by first creating a coloring, and the
>> >> smoother passes are done color by color. Is this how it's done in petsc 
>> >> AMG?
>> >> 
>> >> Tangential, AMGx and OpenFOAM offer something called "DILU", diagonal ILU.
>> >> Is there an equivalent in petsc?
>> >> 
>> >> Thanks,
>> >> Mark
>> 



Re: [petsc-users] GPU implementation of serial smoothers

2023-01-10 Thread Mark Lohry
>
> Is DILU a point-block method? We have -pc_type pbjacobi (and vpbjacobi if
> the node size is not uniform). They are good choices for scale-resolving CFD
> on GPUs.
>

I was hoping you'd know :)  pbjacobi is underperforming ilu by a pretty
wide margin on some of the systems i'm looking at.

We don't have colored smoothers currently in PETSc.
>

So what happens under the hood when I run -mg_levels_pc_type sor on GPU?
Are you actually decomposing the matrix into lower and computing updates
with matrix multiplications? Or is it just the standard serial algorithm
with thread safety ignored?

On Tue, Jan 10, 2023 at 1:52 PM Barry Smith  wrote:

>
>   We don't have colored smoothers currently in PETSc.
>
> > On Jan 10, 2023, at 12:56 PM, Jed Brown  wrote:
> >
> > Is DILU a point-block method? We have -pc_type pbjacobi (and vpbjacobi
> if the node size is not uniform). They are good choices for scale-resolving
> CFD on GPUs.
> >
> > Mark Lohry  writes:
> >
> >> I'm running GAMG with CUDA, and I'm wondering how the nominally serial
> >> smoother algorithms are implemented on GPU? Specifically SOR/GS and
> ILU(0)
> >> -- in e.g. AMGx these are applied by first creating a coloring, and the
> >> smoother passes are done color by color. Is this how it's done in petsc
> AMG?
> >>
> >> Tangential, AMGx and OpenFOAM offer something called "DILU", diagonal
> ILU.
> >> Is there an equivalent in petsc?
> >>
> >> Thanks,
> >> Mark
>
>


Re: [petsc-users] GPU implementation of serial smoothers

2023-01-10 Thread Barry Smith


  We don't have colored smoothers currently in PETSc.

> On Jan 10, 2023, at 12:56 PM, Jed Brown  wrote:
> 
> Is DILU a point-block method? We have -pc_type pbjacobi (and vpbjacobi if the 
> node size is not uniform). They are good choices for scale-resolving CFD on
> GPUs.
> 
> Mark Lohry  writes:
> 
>> I'm running GAMG with CUDA, and I'm wondering how the nominally serial
>> smoother algorithms are implemented on GPU? Specifically SOR/GS and ILU(0)
>> -- in e.g. AMGx these are applied by first creating a coloring, and the
>> smoother passes are done color by color. Is this how it's done in petsc AMG?
>> 
>> Tangential, AMGx and OpenFOAM offer something called "DILU", diagonal ILU.
>> Is there an equivalent in petsc?
>> 
>> Thanks,
>> Mark



Re: [petsc-users] Eliminating rows and columns which are zeros

2023-01-10 Thread Barry Smith

  Yes, after the solve, x will contain correct values for ALL the locations,
including the (zeroed out) rows. Your use case is exactly what redistribute is
for.
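
  Schematically, with your isout/A/b/x (ksp and the rest of the setup are the
usual ones; the sketch only shows the order of the calls):

  PetscCall(MatZeroRowsColumnsIS(A, isout, 1.0, x, b)); /* 1 on the diagonal; b adjusted from x's BC values */
  PetscCall(KSPCreate(PETSC_COMM_WORLD, &ksp));
  PetscCall(KSPSetOperators(ksp, A, A));
  PetscCall(KSPSetFromOptions(ksp));                    /* run with -pc_type redistribute */
  PetscCall(KSPSolve(ksp, b, x));                       /* x keeps the original ordering, zeroed rows included */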

  Barry


> On Jan 10, 2023, at 11:25 AM, Karthikeyan Chockalingam - STFC UKRI 
>  wrote:
> 
> Thank you Barry. This is great!
>  
> I plan to solve using ‘-pc_type redistribute’ after applying the Dirichlet bc 
> using
> MatZeroRowsColumnsIS(A, isout, 1, x, b); 
>  
> When I retrieve the solution data from x (after the solve), can I index it
> using the original ordering (if I may say that)?
>  
> Kind regards,
> Karthik.
>  
> From: Barry Smith <bsm...@petsc.dev>
> Date: Tuesday, 10 January 2023 at 16:04
> To: Chockalingam, Karthikeyan (STFC,DL,HC)
> Cc: petsc-users@mcs.anl.gov
> Subject: Re: [petsc-users] Eliminating rows and columns which are zeros
> 
>  
> https://petsc.org/release/docs/manualpages/PC/PCREDISTRIBUTE/#pcredistribute  
>  -pc_type redistribute
>  
>  
> It does everything for you. Note that if the right hand side for any of the 
> "zero" rows is nonzero then the system is inconsistent and the system does 
> not have a solution.
>  
> Barry
>  
> 
> 
> On Jan 10, 2023, at 10:30 AM, Karthikeyan Chockalingam - STFC UKRI via 
> petsc-users <petsc-users@mcs.anl.gov> wrote:
>  
> Hello,
>  
> I am assembling a MATIJ of size N, where a very large number of rows (and 
> corresponding columns), are zeros. I would like to potentially eliminate them 
> before the solve.
>  
> For instance say N=7
>  
> 0 0  0  0 0 0 0
> 0 1 -1  0 0 0 0
> 0 -1 2  0 0 0 -1
> 0 0  0  0 0 0 0
> 0 0  0  0 0 0 0
> 0 0  0  0 0 0 0
> 0 0  -1 0 0 0 1
>  
> I would like to reduce it to a 3x3
>  
> 1 -1 0
> -1 2 -1
> 0 -1 1
>  
> I do know the size N.
>  
> Q1) How do I do it?
> Q2) Is it better to eliminate them as it would save a lot of memory?
> Q3) At the moment, I don’t know which rows (and columns) have the zero 
> entries but with some effort I probably can find them. Should I know which 
> rows (and columns) I am eliminating?
>  
> Thank you.
>  
> Karthik.
> 



Re: [petsc-users] GPU implementation of serial smoothers

2023-01-10 Thread Jed Brown
Is DILU a point-block method? We have -pc_type pbjacobi (and vpbjacobi if the 
node size is not uniform). They are good choices for scale-resolving CFD on GPUs.
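
For example, a sketch of wiring that up (the block size and the use of GAMG are
assumptions about your setup; ksp and the assembled A are assumed to exist):

  PetscCall(MatSetBlockSize(A, 5));  /* hypothetical: 5 unknowns per node, so 5x5 point blocks */
  PetscCall(PetscOptionsSetValue(NULL, "-pc_type", "gamg"));
  PetscCall(PetscOptionsSetValue(NULL, "-mg_levels_ksp_type", "richardson"));
  PetscCall(PetscOptionsSetValue(NULL, "-mg_levels_pc_type", "pbjacobi"));
  PetscCall(KSPSetFromOptions(ksp));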

Mark Lohry  writes:

> I'm running GAMG with CUDA, and I'm wondering how the nominally serial
> smoother algorithms are implemented on GPU? Specifically SOR/GS and ILU(0)
> -- in e.g. AMGx these are applied by first creating a coloring, and the
> smoother passes are done color by color. Is this how it's done in petsc AMG?
>
> Tangential, AMGx and OpenFOAM offer something called "DILU", diagonal ILU.
> Is there an equivalent in petsc?
>
> Thanks,
> Mark


Re: [petsc-users] PetscSF Fortran interface

2023-01-10 Thread Nicholas Arnold-Medabalimi
Er to be honest I still can't get my stub to compile properly, and I don't
know how to go about making a merge request. But here is what I am
attempting right now. Let me know how best to proceed


It's not exactly clear to me how to set up the remote offset properly.

in src/vec/is/sf/interface/ftn-custom/zsf.c

PETSC_EXTERN void petscsfcreatesectionsf(PetscSF *sf, PetscSection
*rootSection, F90Array1d *aremoteOffsets, PetscSection *leafSection,
PetscSF *sectionSF, int * ierr PETSC_F90_2PTR_PROTO(remoteoffsetsd))
{

  int * remoteOffsets;
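  /* F90Array1dAccess pulls the raw C pointer out of the Fortran array descriptor;
     the hidden descriptor argument travels via PETSC_F90_2PTR_PROTO in the
     prototype and PETSC_F90_2PTR_PARAM at the call site. */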
  *ierr = F90Array1dAccess(aremoteOffsets, PETSC_INT, (void**) &remoteOffsets PETSC_F90_2PTR_PARAM(remoteoffsetsd));if (*ierr) return;
  *ierr = PetscSFCreateSectionSF(*sf,*rootSection, remoteOffsets,*leafSection,*sectionSF);if (*ierr) return;

}

This is the sticking point.

Sincerely
Nicholas


On Tue, Jan 10, 2023 at 12:38 PM Junchao Zhang 
wrote:

> Hi, Nicholas,
>   Could you make a merge request to PETSc and then our Fortran experts can
> comment on your MR?
>   Thanks.
>
> --Junchao Zhang
>
>
> On Tue, Jan 10, 2023 at 11:10 AM Nicholas Arnold-Medabalimi <
> narno...@umich.edu> wrote:
>
>> Hi Junchao
>>
>> I think I'm almost there, but I could use some insight into how to use
>> the PETSC_F90_2PTR_PROTO and F90Array1dAccess for the remoteOffset
>> parameter input so if another function comes up, I can add it myself
>> without wasting your time.
>> I am very grateful for your help and time.
>>
>> Sincerely
>> Nicholas
>>
>> On Tue, Jan 10, 2023 at 10:55 AM Junchao Zhang 
>> wrote:
>>
>>> Hi, Nicholas,
>>>I am not a fortran guy, but I will try to add petscsfcreatesectionsf.
>>>
>>>Thanks.
>>> --Junchao Zhang
>>>
>>>
>>> On Tue, Jan 10, 2023 at 12:50 AM Nicholas Arnold-Medabalimi <
>>> narno...@umich.edu> wrote:
>>>
 I think it should be something like this, but I'm not very fluent in
 Fortran C interop syntax. Any advice would be appreciated. Thanks

 PETSC_EXTERN void petscsfcreatesectionsf(PetscSF *sf, PetscSection *
 rootSection, F90Array1d *aremoteOffsets, PetscSection *leafSection,
 PetscSF *sectionSF, int * ierr PETSC_F90_2PTR_PROTO(remoteoffsetsd))
 {

   int * remoteOffsets;
   *ierr = F90Array1dAccess(aremoteOffsets, PETSC_INT, (void**) &remoteOffsets PETSC_F90_2PTR_PARAM(remoteoffsetsd));if (*ierr) return;
   *ierr = PetscSFCreateSectionSF(*sf,*rootSection, remoteOffsets,*leafSection,*sectionSF);if (*ierr) return;

 }

 On Mon, Jan 9, 2023 at 11:41 PM Nicholas Arnold-Medabalimi <
 narno...@umich.edu> wrote:

> Hi Junchao
>
> Thanks again for your help in November. I've been using your merge
> request branch quite heavily. Would it be possible to add a
> petscsfcreatesectionsf interface as well?
> I'm trying to write it myself using your commits as a guide but I have
> been struggling with handling the section parameter properly.
>
> Sincerely
> Nicholas
>
> On Sat, Nov 19, 2022 at 9:44 PM Junchao Zhang 
> wrote:
>
>>
>>
>>
>> On Sat, Nov 19, 2022 at 8:05 PM Nicholas Arnold-Medabalimi <
>> narno...@umich.edu> wrote:
>>
>>> Hi
>>>
>>> Thanks, this is awesome. Thanks for the very prompt fix. Just one
>>> question: will the array outputs on the fortran side copies (and need 
>>> to be
>>> deallocated) or direct access to the dmplex?
>>>
>> Direct access to internal data;  no need to deallocate
>>
>>
>>>
>>> Sincerely
>>> Nicholas
>>>
>>> On Sat, Nov 19, 2022 at 8:21 PM Junchao Zhang <
>>> junchao.zh...@gmail.com> wrote:
>>>
 Hi, Nicholas,
   See this MR, https://gitlab.com/petsc/petsc/-/merge_requests/5860
   It is in testing, but you can try branch
 jczhang/add-petscsf-fortran to see if it works for you.

   Thanks.
 --Junchao Zhang

 On Sat, Nov 19, 2022 at 4:16 PM Nicholas Arnold-Medabalimi <
 narno...@umich.edu> wrote:

> Hi Junchao
>
> Thanks. I was wondering if there is any update on this. I may
> write a small interface for those two routines myself in the interim 
> but
> I'd appreciate any insight you have.
>
> Sincerely
> Nicholas
>
> On Wed, Nov 16, 2022 at 10:39 PM Junchao Zhang <
> junchao.zh...@gmail.com> wrote:
>
>> Hi, Nicholas,
>>   I will have a look and get back to you.
>>   Thanks.
>> --Junchao Zhang
>>
>>
>> On Wed, Nov 16, 2022 at 9:27 PM Nicholas Arnold-Medabalimi <
>> narno...@umich.edu> wrote:
>>
>>> Hi Petsc Users
>>>
>>> I'm in the process of adding some Petsc for mesh management into
>>> an existing Fortran Solver. It has been relatively straightforward 
>>> so far
>>> but I am running into an issue with using PetscSF routines. 

Re: [petsc-users] PetscSF Fortran interface

2023-01-10 Thread Junchao Zhang
Hi, Nicholas,
  Could you make a merge request to PETSc and then our Fortran experts can
comment on your MR?
  Thanks.

--Junchao Zhang


On Tue, Jan 10, 2023 at 11:10 AM Nicholas Arnold-Medabalimi <
narno...@umich.edu> wrote:

> Hi Junchao
>
> I think I'm almost there, but I could use some insight into how to use the
> PETSC_F90_2PTR_PROTO and F90Array1dAccess for the remoteOffset parameter
> input so if another function comes up, I can add it myself without wasting
> your time.
> I am very grateful for your help and time.
>
> Sincerely
> Nicholas
>
> On Tue, Jan 10, 2023 at 10:55 AM Junchao Zhang 
> wrote:
>
>> Hi, Nicholas,
>>I am not a fortran guy, but I will try to add petscsfcreatesectionsf.
>>
>>Thanks.
>> --Junchao Zhang
>>
>>
>> On Tue, Jan 10, 2023 at 12:50 AM Nicholas Arnold-Medabalimi <
>> narno...@umich.edu> wrote:
>>
>>> I think it should be something like this, but I'm not very fluent in
>>> Fortran C interop syntax. Any advice would be appreciated. Thanks
>>>
>>> PETSC_EXTERN void petscsfcreatesectionsf(PetscSF *sf, PetscSection *
>>> rootSection, F90Array1d *aremoteOffsets, PetscSection *leafSection,
>>> PetscSF *sectionSF, int * ierr PETSC_F90_2PTR_PROTO(remoteoffsetsd))
>>> {
>>>
>>>   int * remoteOffsets;
>>>   *ierr = F90Array1dAccess(aremoteOffsets, PETSC_INT, (void**) &remoteOffsets PETSC_F90_2PTR_PARAM(remoteoffsetsd));if (*ierr) return;
>>>   *ierr = PetscSFCreateSectionSF(*sf,*rootSection, remoteOffsets,*leafSection,*sectionSF);if (*ierr) return;
>>>
>>> }
>>>
>>> On Mon, Jan 9, 2023 at 11:41 PM Nicholas Arnold-Medabalimi <
>>> narno...@umich.edu> wrote:
>>>
 Hi Junchao

 Thanks again for your help in November. I've been using your merge
 request branch quite heavily. Would it be possible to add a
 petscsfcreatesectionsf interface as well?
 I'm trying to write it myself using your commits as a guide but I have
 been struggling with handling the section parameter properly.

 Sincerely
 Nicholas

 On Sat, Nov 19, 2022 at 9:44 PM Junchao Zhang 
 wrote:

>
>
>
> On Sat, Nov 19, 2022 at 8:05 PM Nicholas Arnold-Medabalimi <
> narno...@umich.edu> wrote:
>
>> Hi
>>
>> Thanks, this is awesome. Thanks for the very prompt fix. Just one
>> question: will the array outputs on the fortran side copies (and need to 
>> be
>> deallocated) or direct access to the dmplex?
>>
> Direct access to internal data;  no need to deallocate
>
>
>>
>> Sincerely
>> Nicholas
>>
>> On Sat, Nov 19, 2022 at 8:21 PM Junchao Zhang <
>> junchao.zh...@gmail.com> wrote:
>>
>>> Hi, Nicholas,
>>>   See this MR, https://gitlab.com/petsc/petsc/-/merge_requests/5860
>>>   It is in testing, but you can try branch
>>> jczhang/add-petscsf-fortran to see if it works for you.
>>>
>>>   Thanks.
>>> --Junchao Zhang
>>>
>>> On Sat, Nov 19, 2022 at 4:16 PM Nicholas Arnold-Medabalimi <
>>> narno...@umich.edu> wrote:
>>>
 Hi Junchao

 Thanks. I was wondering if there is any update on this. I may write
 a small interface for those two routines myself in the interim but I'd
 appreciate any insight you have.

 Sincerely
 Nicholas

 On Wed, Nov 16, 2022 at 10:39 PM Junchao Zhang <
 junchao.zh...@gmail.com> wrote:

> Hi, Nicholas,
>   I will have a look and get back to you.
>   Thanks.
> --Junchao Zhang
>
>
> On Wed, Nov 16, 2022 at 9:27 PM Nicholas Arnold-Medabalimi <
> narno...@umich.edu> wrote:
>
>> Hi Petsc Users
>>
>> I'm in the process of adding some Petsc for mesh management into
>> an existing Fortran Solver. It has been relatively straightforward 
>> so far
>> but I am running into an issue with using PetscSF routines. Some 
>> like the
>> PetscSFGetGraph work no problem but a few of my routines require the 
>> use of
>> PetscSFGetLeafRanks and PetscSFGetRootRanks and those don't seem to 
>> be in
>> the fortran interface and I just get a linking error. I also don't 
>> seem to
>> see a PetscSF file in the finclude. Any clarification or assistance 
>> would
>> be appreciated.
>>
>>
>> Sincerely
>> Nicholas
>>
>> --
>> Nicholas Arnold-Medabalimi
>>
>> Ph.D. Candidate
>> Computational Aeroscience Lab
>> University of Michigan
>>
>

 --
 Nicholas Arnold-Medabalimi

 Ph.D. Candidate
 Computational Aeroscience Lab
 University of Michigan

>>>
>>
>> --
>> Nicholas Arnold-Medabalimi
>>
>> Ph.D. Candidate
>> Computational Aeroscience Lab
>> University of 

[petsc-users] GPU implementation of serial smoothers

2023-01-10 Thread Mark Lohry
I'm running GAMG with CUDA, and I'm wondering how the nominally serial
smoother algorithms are implemented on GPU? Specifically SOR/GS and ILU(0)
-- in e.g. AMGx these are applied by first creating a coloring, and the
smoother passes are done color by color. Is this how it's done in petsc AMG?
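
For reference, the color-by-color scheme I mean is, schematically (plain CSR,
illustrative only -- not AMGx or PETSc code):

/* One multicolor Gauss-Seidel sweep: rows sharing a color do not couple, so each
 * color's updates are independent (one kernel per color on a GPU); the colors
 * themselves are processed sequentially. */
void multicolor_gs_sweep(int n, const int *rowptr, const int *colind, const double *val,
                         const double *b, double *x, int ncolors, const int *color)
{
  for (int c = 0; c < ncolors; c++) {        /* one pass per color */
    for (int i = 0; i < n; i++) {
      if (color[i] != c) continue;           /* only rows of the current color */
      double diag = 1.0, sum = b[i];
      for (int k = rowptr[i]; k < rowptr[i + 1]; k++) {
        if (colind[k] == i) diag = val[k];
        else sum -= val[k] * x[colind[k]];
      }
      x[i] = sum / diag;
    }
  }
}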

Tangential, AMGx and OpenFOAM offer something called "DILU", diagonal ILU.
Is there an equivalent in petsc?

Thanks,
Mark


Re: [petsc-users] PetscSF Fortran interface

2023-01-10 Thread Nicholas Arnold-Medabalimi
Hi Junchao

I think I'm almost there, but I could use some insight into how to use the
PETSC_F90_2PTR_PROTO and F90Array1dAccess for the remoteOffset parameter
input so if another function comes up, I can add it myself without wasting
your time.
I am very grateful for your help and time.

Sincerely
Nicholas

On Tue, Jan 10, 2023 at 10:55 AM Junchao Zhang 
wrote:

> Hi, Nicholas,
>I am not a fortran guy, but I will try to add petscsfcreatesectionsf.
>
>Thanks.
> --Junchao Zhang
>
>
> On Tue, Jan 10, 2023 at 12:50 AM Nicholas Arnold-Medabalimi <
> narno...@umich.edu> wrote:
>
>> I think it should be something like this, but I'm not very fluent in
>> Fortran C interop syntax. Any advice would be appreciated. Thanks
>>
>> PETSC_EXTERN void petscsfcreatesectionsf(PetscSF *sf, PetscSection *
>> rootSection, F90Array1d *aremoteOffsets, PetscSection *leafSection,
>> PetscSF *sectionSF, int * ierr PETSC_F90_2PTR_PROTO(remoteoffsetsd))
>> {
>>
>>   int * remoteOffsets;
>>   *ierr = F90Array1dAccess(aremoteOffsets, PETSC_INT, (void**) &remoteOffsets PETSC_F90_2PTR_PARAM(remoteoffsetsd));if (*ierr) return;
>>   *ierr = PetscSFCreateSectionSF(*sf,*rootSection, remoteOffsets,*leafSection,*sectionSF);if (*ierr) return;
>>
>> }
>>
>> On Mon, Jan 9, 2023 at 11:41 PM Nicholas Arnold-Medabalimi <
>> narno...@umich.edu> wrote:
>>
>>> Hi Junchao
>>>
>>> Thanks again for your help in November. I've been using your merge
>>> request branch quite heavily. Would it be possible to add a
>>> petscsfcreatesectionsf interface as well?
>>> I'm trying to write it myself using your commits as a guide but I have
>>> been struggling with handling the section parameter properly.
>>>
>>> Sincerely
>>> Nicholas
>>>
>>> On Sat, Nov 19, 2022 at 9:44 PM Junchao Zhang 
>>> wrote:
>>>



 On Sat, Nov 19, 2022 at 8:05 PM Nicholas Arnold-Medabalimi <
 narno...@umich.edu> wrote:

> Hi
>
> Thanks, this is awesome. Thanks for the very prompt fix. Just one
> question: will the array outputs on the fortran side copies (and need to 
> be
> deallocated) or direct access to the dmplex?
>
 Direct access to internal data;  no need to deallocate


>
> Sincerely
> Nicholas
>
> On Sat, Nov 19, 2022 at 8:21 PM Junchao Zhang 
> wrote:
>
>> Hi, Nicholas,
>>   See this MR, https://gitlab.com/petsc/petsc/-/merge_requests/5860
>>   It is in testing, but you can try branch
>> jczhang/add-petscsf-fortran to see if it works for you.
>>
>>   Thanks.
>> --Junchao Zhang
>>
>> On Sat, Nov 19, 2022 at 4:16 PM Nicholas Arnold-Medabalimi <
>> narno...@umich.edu> wrote:
>>
>>> Hi Junchao
>>>
>>> Thanks. I was wondering if there is any update on this. I may write
>>> a small interface for those two routines myself in the interim but I'd
>>> appreciate any insight you have.
>>>
>>> Sincerely
>>> Nicholas
>>>
>>> On Wed, Nov 16, 2022 at 10:39 PM Junchao Zhang <
>>> junchao.zh...@gmail.com> wrote:
>>>
 Hi, Nicholas,
   I will have a look and get back to you.
   Thanks.
 --Junchao Zhang


 On Wed, Nov 16, 2022 at 9:27 PM Nicholas Arnold-Medabalimi <
 narno...@umich.edu> wrote:

> Hi Petsc Users
>
> I'm in the process of adding some Petsc for mesh management into
> an existing Fortran Solver. It has been relatively straightforward so 
> far
> but I am running into an issue with using PetscSF routines. Some like 
> the
> PetscSFGetGraph work no problem but a few of my routines require the 
> use of
> PetscSFGetLeafRanks and PetscSFGetRootRanks and those don't seem to 
> be in
> the fortran interface and I just get a linking error. I also don't 
> seem to
> see a PetscSF file in the finclude. Any clarification or assistance 
> would
> be appreciated.
>
>
> Sincerely
> Nicholas
>
> --
> Nicholas Arnold-Medabalimi
>
> Ph.D. Candidate
> Computational Aeroscience Lab
> University of Michigan
>

>>>
>>> --
>>> Nicholas Arnold-Medabalimi
>>>
>>> Ph.D. Candidate
>>> Computational Aeroscience Lab
>>> University of Michigan
>>>
>>
>
> --
> Nicholas Arnold-Medabalimi
>
> Ph.D. Candidate
> Computational Aeroscience Lab
> University of Michigan
>

>>>
>>> --
>>> Nicholas Arnold-Medabalimi
>>>
>>> Ph.D. Candidate
>>> Computational Aeroscience Lab
>>> University of Michigan
>>>
>>
>>
>> --
>> Nicholas Arnold-Medabalimi
>>
>> Ph.D. Candidate
>> Computational Aeroscience Lab
>> University of Michigan
>>
>

-- 
Nicholas Arnold-Medabalimi

Ph.D. Candidate
Computational Aeroscience Lab
University of Michigan


Re: [petsc-users] Eliminating rows and columns which are zeros

2023-01-10 Thread Karthikeyan Chockalingam - STFC UKRI via petsc-users
Thank you Barry. This is great!

I plan to solve using ‘-pc_type redistribute’ after applying the Dirichlet bc 
using

MatZeroRowsColumnsIS(A, isout, 1, x, b);


When I retrieve the solution data from x (after the solve), can I index it
using the original ordering (if I may say that)?

Kind regards,
Karthik.

From: Barry Smith 
Date: Tuesday, 10 January 2023 at 16:04
To: Chockalingam, Karthikeyan (STFC,DL,HC) 
Cc: petsc-users@mcs.anl.gov 
Subject: Re: [petsc-users] Eliminating rows and columns which are zeros

https://petsc.org/release/docs/manualpages/PC/PCREDISTRIBUTE/#pcredistribute   
-pc_type redistribute


It does everything for you. Note that if the right hand side for any of the 
"zero" rows is nonzero then the system is inconsistent and the system does not 
have a solution.

Barry



On Jan 10, 2023, at 10:30 AM, Karthikeyan Chockalingam - STFC UKRI via 
petsc-users  wrote:

Hello,

I am assembling a MATIJ of size N, where a very large number of rows (and 
corresponding columns), are zeros. I would like to potentially eliminate them 
before the solve.

For instance say N=7

0 0  0  0 0 0 0
0 1 -1  0 0 0 0
0 -1 2  0 0 0 -1
0 0  0  0 0 0 0
0 0  0  0 0 0 0
0 0  0  0 0 0 0
0 0  -1 0 0 0 1

I would like to reduce it to a 3x3

1 -1 0
-1 2 -1
0 -1 1

I do know the size N.

Q1) How do I do it?
Q2) Is it better to eliminate them as it would save a lot of memory?
Q3) At the moment, I don’t know which rows (and columns) have the zero entries 
but with some effort I probably can find them. Should I know which rows (and 
columns) I am eliminating?

Thank you.

Karthik.




Re: [petsc-users] Eliminating rows and columns which are zeros

2023-01-10 Thread Barry Smith

https://petsc.org/release/docs/manualpages/PC/PCREDISTRIBUTE/#pcredistribute   
-pc_type redistribute


It does everything for you. Note that if the right hand side for any of the 
"zero" rows is nonzero then the system is inconsistent and the system does not 
have a solution.

Barry


> On Jan 10, 2023, at 10:30 AM, Karthikeyan Chockalingam - STFC UKRI via 
> petsc-users  wrote:
> 
> Hello,
>  
> I am assembling a MATIJ of size N, where a very large number of rows (and 
> corresponding columns), are zeros. I would like to potentially eliminate them 
> before the solve.
>  
> For instance say N=7
>  
> 0 0  0  0 0 0 0
> 0 1 -1  0 0 0 0
> 0 -1 2  0 0 0 -1
> 0 0  0  0 0 0 0
> 0 0  0  0 0 0 0
> 0 0  0  0 0 0 0
> 0 0  -1 0 0 0 1
>  
> I would like to reduce it to a 3x3
>  
> 1 -1 0
> -1 2 -1
> 0 -1 1
>  
> I do know the size N.
>  
> Q1) How do I do it?
> Q2) Is it better to eliminate them as it would save a lot of memory?
> Q3) At the moment, I don’t know which rows (and columns) have the zero 
> entries but with some effort I probably can find them. Should I know which 
> rows (and columns) I am eliminating?
>  
> Thank you.
>  
> Karthik.
> 



[petsc-users] Eliminating rows and columns which are zeros

2023-01-10 Thread Karthikeyan Chockalingam - STFC UKRI via petsc-users
Hello,

I am assembling a MATIJ of size N, where a very large number of rows (and 
corresponding columns), are zeros. I would like to potentially eliminate them 
before the solve.

For instance say N=7


0 0  0  0 0 0 0
0 1 -1  0 0 0 0
0 -1 2  0 0 0 -1
0 0  0  0 0 0 0
0 0  0  0 0 0 0
0 0  0  0 0 0 0
0 0  -1 0 0 0 1

I would like to reduce it to a 3x3

1 -1 0
-1 2 -1
0 -1 1

I do know the size N.

Q1) How do I do it?
Q2) Is it better to eliminate them as it would save a lot of memory?
Q3) At the moment, I don’t know which rows (and columns) have the zero entries 
but with some effort I probably can find them. Should I know which rows (and 
columns) I am eliminating?

Thank you.

Karthik.

This email and any attachments are intended solely for the use of the named 
recipients. If you are not the intended recipient you must not use, disclose, 
copy or distribute this email or any of its attachments and should notify the 
sender immediately and delete this email from your system. UK Research and 
Innovation (UKRI) has taken every reasonable precaution to minimise risk of 
this email or any attachments containing viruses or malware but the recipient 
should carry out its own virus and malware checks before opening the 
attachments. UKRI does not accept any liability for any losses or damages which 
the recipient may sustain due to presence of any viruses.