[petsc-users] Status of PETScSF failures with GPU-aware MPI on Perlmutter

2023-11-02 Thread Sajid Ali
Hi PETSc-developers,

I had posted about crashes within PETScSF when using GPU-aware MPI on
Perlmutter a while ago (
https://lists.mcs.anl.gov/mailman/htdig/petsc-users/2022-February/045585.html).
Now that the software stacks have stabilized, I was wondering whether a fix
is available, as I am still observing similar crashes.

I am attaching the trace of the latest crash (with PETSc-3.20.0) for
reference.

Thank You,
Sajid Ali (he/him) | Research Associate
Data Science, Simulation, and Learning Division
Fermi National Accelerator Laboratory
s-sajid-ali.github.io


2_gpu_crash
Description: Binary data


Re: [petsc-users] KSP_Solve crashes in debug mode

2023-02-24 Thread Sajid Ali Syed via petsc-users
Hi Barry,

The application calls PetscCallAbort in a loop, i.e.

for i in range:
  void routine(PetscCallAbort(function_returning_petsc_error_code))

From the prior logs it looks like the stack grows every time PetscCallAbort is 
called (in other words, the stack does not shrink upon successful exit from 
PetscCallAbort).

Is this usage pattern not recommended? Should I be manually checking for 
success of the `function_returning_petsc_error_code` and throw instead of 
relying on PetscCallAbort?
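
For concreteness, a minimal self-contained sketch of this usage pattern (the routine
apply_kick, the 4x4 system, and the loop count are illustrative only, not the actual
application code):

  #include <petscksp.h>

  /* A void routine cannot return a PetscErrorCode, so errors inside it
     are handled by PetscCallAbort(), which aborts on the communicator. */
  static void apply_kick(KSP ksp, Vec b, Vec x)
  {
    PetscCallAbort(PETSC_COMM_WORLD, KSPSolve(ksp, b, x));
  }

  int main(int argc, char **argv)
  {
    Mat A;
    Vec b, x;
    KSP ksp;

    PetscCall(PetscInitialize(&argc, &argv, NULL, NULL));
    /* small identity system, just so the repeated solves have something to do */
    PetscCall(MatCreateAIJ(PETSC_COMM_WORLD, PETSC_DECIDE, PETSC_DECIDE, 4, 4, 1, NULL, 0, NULL, &A));
    for (PetscInt i = 0; i < 4; ++i) PetscCall(MatSetValue(A, i, i, 1.0, INSERT_VALUES));
    PetscCall(MatAssemblyBegin(A, MAT_FINAL_ASSEMBLY));
    PetscCall(MatAssemblyEnd(A, MAT_FINAL_ASSEMBLY));
    PetscCall(MatCreateVecs(A, &x, &b));
    PetscCall(VecSet(b, 1.0));
    PetscCall(KSPCreate(PETSC_COMM_WORLD, &ksp));
    PetscCall(KSPSetOperators(ksp, A, A));
    PetscCall(KSPSetFromOptions(ksp));
    for (PetscInt i = 0; i < 100; ++i) apply_kick(ksp, b, x);  /* the loop in question */
    PetscCall(KSPDestroy(&ksp));
    PetscCall(MatDestroy(&A));
    PetscCall(VecDestroy(&x));
    PetscCall(VecDestroy(&b));
    PetscCall(PetscFinalize());
    return 0;
  }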



Thank You,
Sajid Ali (he/him) | Research Associate
Data Science, Simulation, and Learning Division
Fermi National Accelerator Laboratory
s-sajid-ali.github.io<http://s-sajid-ali.github.io>


From: Barry Smith 
Sent: Wednesday, February 22, 2023 6:49 PM
To: Sajid Ali Syed 
Cc: Matthew Knepley ; petsc-users@mcs.anl.gov 

Subject: Re: [petsc-users] KSP_Solve crashes in debug mode


  Hmm, there could be a bug in our handling of the stack when it reaches the 
maximum. It is supposed to just stop collecting additional levels at that point, 
but that path has likely not been tested through a lot of recent refactoring.

   What are you doing to have so many stack frames?

On Feb 22, 2023, at 6:32 PM, Sajid Ali Syed  wrote:

Hi Matt,

Adding `-checkstack` does not prevent the crash, both on my laptop and on the 
cluster.

What does prevent the crash (on my laptop at least) is changing 
`PETSCSTACKSIZE` from 64 to 256 here : 
https://github.com/petsc/petsc/blob/main/include/petscerror.h#L1153
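
For reference, the workaround amounts to a one-line change of the stack-depth
constant in include/petscerror.h (paraphrased here; only the value differs from
the released header):

  /* include/petscerror.h (paraphrased) */
  #define PETSCSTACKSIZE 256   /* default is 64 */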


Thank You,
Sajid Ali (he/him) | Research Associate
Data Science, Simulation, and Learning Division
Fermi National Accelerator Laboratory
s-sajid-ali.github.io


From: Matthew Knepley mailto:knep...@gmail.com>>
Sent: Wednesday, February 22, 2023 5:23 PM
To: Sajid Ali Syed mailto:sas...@fnal.gov>>
Cc: Barry Smith mailto:bsm...@petsc.dev>>; 
petsc-users@mcs.anl.gov<mailto:petsc-users@mcs.anl.gov> 
mailto:petsc-users@mcs.anl.gov>>
Subject: Re: [petsc-users] KSP_Solve crashes in debug mode

On Wed, Feb 22, 2023 at 6:18 PM Sajid Ali Syed via petsc-users 
mailto:petsc-users@mcs.anl.gov>> wrote:
One thing to note in relation to the trace attached in the previous email is 
that there are no warnings until the 36th call to KSP_Solve. The first error 
(as indicated by ASAN) occurs somewhere before the 40th call to KSP_Solve (part 
of what the application marks as turn 10 of the propagator). The crash finally 
occurs on the 43rd call to KSP_Solve.

Looking at the trace, it appears that stack handling is messed up and 
eventually it causes the crash. This can happen when
PetscFunctionBegin is not matched up with PetscFunctionReturn. Can you try 
running this with

  -checkstack

  Thanks,

 Matt
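
A minimal sketch of the PetscFunctionBegin/PetscFunctionReturn pairing referred
to above (the routine below is illustrative, not from PETSc or the application);
the key point is that every return path, including early returns, must go through
PetscFunctionReturn() so the frame pushed by PetscFunctionBegin is popped:

  #include <petscvec.h>

  static PetscErrorCode NormalizeIfLarge(Vec x, PetscReal tol)
  {
    PetscReal nrm;

    PetscFunctionBegin;                      /* pushes a frame on the PETSc stack */
    PetscCall(VecNorm(x, NORM_2, &nrm));
    if (nrm < tol) PetscFunctionReturn(0);   /* early return still pops the frame */
    PetscCall(VecScale(x, 1.0 / nrm));
    PetscFunctionReturn(0);                  /* normal return pops the frame */
  }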

Thank You,
Sajid Ali (he/him) | Research Associate
Data Science, Simulation, and Learning Division
Fermi National Accelerator Laboratory
s-sajid-ali.github.io


From: Sajid Ali Syed mailto:sas...@fnal.gov>>
Sent: Wednesday, February 22, 2023 5:11 PM
To: Barry Smith mailto:bsm...@petsc.dev>>
Cc: petsc-users@mcs.anl.gov<mailto:petsc-users@mcs.anl.gov> 
mailto:petsc-users@mcs.anl.gov>>
Subject: Re: [petsc-users] KSP_Solve crashes in debug mode

Hi Barry,

Thanks a lot for fixing this issue. I ran the same problem on a linux machine 
and have the following trace for the same crash (with ASAN turned on for both 
PETSc (on the latest commit of the branch) and the application) : 
https://gist.github.com/s-sajid-ali/85bdf689eb8452ef8702c214c4df6940

The trace seems to indicate a couple of buffer overflows, one of which causes 
the crash. I'm not sure as to what causes them.

Thank You,
Sajid Ali (he/him) | Research Associate
Data Science, Simulation, and Learning Division
Fermi National Accelerator Laboratory
s-sajid-ali.github.io

Re: [petsc-users] KSP_Solve crashes in debug mode

2023-02-22 Thread Sajid Ali Syed via petsc-users
Via a checkpoint in `PetscOptionsCheckInitial_Private`, I can confirm that 
`checkstack` is set to `PETSC_TRUE`, yet this yields no additional information 
about the erroneous stack handling.


Thank You,
Sajid Ali (he/him) | Research Associate
Data Science, Simulation, and Learning Division
Fermi National Accelerator Laboratory
s-sajid-ali.github.io<http://s-sajid-ali.github.io>


From: Sajid Ali Syed 
Sent: Wednesday, February 22, 2023 6:34 PM
To: Matthew Knepley 
Cc: Barry Smith ; petsc-users@mcs.anl.gov 

Subject: Re: [petsc-users] KSP_Solve crashes in debug mode

Hi Matt,

This is a trace from the same crash, but with `-checkstack` included in 
.petscrc​ : https://gist.github.com/s-sajid-ali/455b3982d47a31bff9e7ee211dd43991


I don't see any additional information regarding the possible cause.


Thank You,
Sajid Ali (he/him) | Research Associate
Data Science, Simulation, and Learning Division
Fermi National Accelerator Laboratory
s-sajid-ali.github.io<http://s-sajid-ali.github.io>


From: Matthew Knepley 
Sent: Wednesday, February 22, 2023 6:28 PM
To: Sajid Ali Syed 
Cc: Barry Smith ; petsc-users@mcs.anl.gov 

Subject: Re: [petsc-users] KSP_Solve crashes in debug mode

On Wed, Feb 22, 2023 at 6:32 PM Sajid Ali Syed 
mailto:sas...@fnal.gov>> wrote:
Hi Matt,

Adding `-checkstack` does not prevent the crash, both on my laptop and on the 
cluster.

It will not prevent a crash. The output is intended to show us where the stack 
problem originates. Can you send the output?

  Thanks,

Matt

What does prevent the crash (on my laptop at least) is changing 
`PETSCSTACKSIZE` from 64 to 256 here : 
https://github.com/petsc/petsc/blob/main/include/petscerror.h#L1153


Thank You,
Sajid Ali (he/him) | Research Associate
Data Science, Simulation, and Learning Division
Fermi National Accelerator Laboratory
s-sajid-ali.github.io


From: Matthew Knepley mailto:knep...@gmail.com>>
Sent: Wednesday, February 22, 2023 5:23 PM
To: Sajid Ali Syed mailto:sas...@fnal.gov>>
Cc: Barry Smith mailto:bsm...@petsc.dev>>; 
petsc-users@mcs.anl.gov<mailto:petsc-users@mcs.anl.gov> 
mailto:petsc-users@mcs.anl.gov>>
Subject: Re: [petsc-users] KSP_Solve crashes in debug mode

On Wed, Feb 22, 2023 at 6:18 PM Sajid Ali Syed via petsc-users 
mailto:petsc-users@mcs.anl.gov>> wrote:
One thing to note in relation to the trace attached in the previous email is 
that there are no warnings until the 36th call to KSP_Solve. The first error 
(as indicated by ASAN) occurs somewhere before the 40th call to KSP_Solve (part 
of what the application marks as turn 10 of the propagator). The crash finally 
occurs on the 43rd call to KSP_Solve.

Looking at the trace, it appears that stack handling is messed up and 
eventually it causes the crash. This can happen when
PetscFunctionBegin is not matched up with PetscFunctionReturn. Can you try 
running this with

  -checkstack

  Thanks,

 Matt

Thank You,
Sajid Ali (he/him) | Research Associate
Data Science, Simulation, and Learning Division
Fermi National Accelerator Laboratory
s-sajid-ali.github.io


From: Sajid Ali Syed mailto:sas...@fnal.gov>>
Sent: Wednesday, February 22, 2023 5:11 PM
To: Barry Smith mailto:bsm...@petsc.dev>>
Cc: petsc-users@mcs.anl.gov<mailto:petsc-users@mcs.anl.gov> 
mailto:petsc-users@mcs.anl.gov>>
Subject: Re: [petsc-users] KSP_Solve crashes in debug mode

Hi Barry,

Thanks a lot for fixing this issue. I ran the same problem on a linux machine 
and have the following trace for the same crash (with ASAN turned on for both 
PETSc (on the latest commit of the branch) and the application) : 
https://gist.github.com/s-sajid-ali/85bdf689eb8452ef8702c214c4df6940

The trace seems to indicate a couple of buffer overflows, one of which causes 
the crash. I'm not sure as to what causes them.

Re: [petsc-users] KSP_Solve crashes in debug mode

2023-02-22 Thread Sajid Ali Syed via petsc-users
Hi Matt,

This is a trace from the same crash, but with `-checkstack` included in 
.petscrc​ : https://gist.github.com/s-sajid-ali/455b3982d47a31bff9e7ee211dd43991


I don't see any additional information regarding the possible cause.


Thank You,
Sajid Ali (he/him) | Research Associate
Data Science, Simulation, and Learning Division
Fermi National Accelerator Laboratory
s-sajid-ali.github.io<http://s-sajid-ali.github.io>


From: Matthew Knepley 
Sent: Wednesday, February 22, 2023 6:28 PM
To: Sajid Ali Syed 
Cc: Barry Smith ; petsc-users@mcs.anl.gov 

Subject: Re: [petsc-users] KSP_Solve crashes in debug mode

On Wed, Feb 22, 2023 at 6:32 PM Sajid Ali Syed 
mailto:sas...@fnal.gov>> wrote:
Hi Matt,

Adding `-checkstack` does not prevent the crash, both on my laptop and on the 
cluster.

It will not prevent a crash. The output is intended to show us where the stack 
problem originates. Can you send the output?

  Thanks,

Matt

What does prevent the crash (on my laptop at least) is changing 
`PETSCSTACKSIZE` from 64 to 256 here : 
https://github.com/petsc/petsc/blob/main/include/petscerror.h#L1153


Thank You,
Sajid Ali (he/him) | Research Associate
Data Science, Simulation, and Learning Division
Fermi National Accelerator Laboratory
s-sajid-ali.github.io


From: Matthew Knepley mailto:knep...@gmail.com>>
Sent: Wednesday, February 22, 2023 5:23 PM
To: Sajid Ali Syed mailto:sas...@fnal.gov>>
Cc: Barry Smith mailto:bsm...@petsc.dev>>; 
petsc-users@mcs.anl.gov<mailto:petsc-users@mcs.anl.gov> 
mailto:petsc-users@mcs.anl.gov>>
Subject: Re: [petsc-users] KSP_Solve crashes in debug mode

On Wed, Feb 22, 2023 at 6:18 PM Sajid Ali Syed via petsc-users 
mailto:petsc-users@mcs.anl.gov>> wrote:
One thing to note in relation to the trace attached in the previous email is 
that there are no warnings until the 36th call to KSP_Solve. The first error 
(as indicated by ASAN) occurs somewhere before the 40th call to KSP_Solve (part 
of what the application marks as turn 10 of the propagator). The crash finally 
occurs on the 43rd call to KSP_Solve.

Looking at the trace, it appears that stack handling is messed up and 
eventually it causes the crash. This can happen when
PetscFunctionBegin is not matched up with PetscFunctionReturn. Can you try 
running this with

  -checkstack

  Thanks,

 Matt

Thank You,
Sajid Ali (he/him) | Research Associate
Data Science, Simulation, and Learning Division
Fermi National Accelerator Laboratory
s-sajid-ali.github.io


From: Sajid Ali Syed mailto:sas...@fnal.gov>>
Sent: Wednesday, February 22, 2023 5:11 PM
To: Barry Smith mailto:bsm...@petsc.dev>>
Cc: petsc-users@mcs.anl.gov<mailto:petsc-users@mcs.anl.gov> 
mailto:petsc-users@mcs.anl.gov>>
Subject: Re: [petsc-users] KSP_Solve crashes in debug mode

Hi Barry,

Thanks a lot for fixing this issue. I ran the same problem on a linux machine 
and have the following trace for the same crash (with ASAN turned on for both 
PETSc (on the latest commit of the branch) and the application) : 
https://gist.github.com/s-sajid-ali/85bdf689eb8452ef8702c214c4df6940

The trace seems to indicate a couple of buffer overflows, one of which causes 
the crash. I'm not sure as to what causes them.

Thank You,
Sajid Ali (he/him) | Research Associate
Data Science, Simulation, and Learning Division
Fermi National Accelerator Laboratory
s-sajid-ali.github.io

________
From: Barry Smith mailto:bsm...@petsc.dev>>
Sent: Wednesday, February 15, 2023 2:01 PM
To: Sajid Ali S

Re: [petsc-users] KSP_Solve crashes in debug mode

2023-02-22 Thread Sajid Ali Syed via petsc-users
Hi Matt,

Adding `-checkstack` does not prevent the crash, both on my laptop and on the 
cluster.

What does prevent the crash (on my laptop at least) is changing 
`PETSCSTACKSIZE` from 64 to 256 here : 
https://github.com/petsc/petsc/blob/main/include/petscerror.h#L1153


Thank You,
Sajid Ali (he/him) | Research Associate
Data Science, Simulation, and Learning Division
Fermi National Accelerator Laboratory
s-sajid-ali.github.io<http://s-sajid-ali.github.io>


From: Matthew Knepley 
Sent: Wednesday, February 22, 2023 5:23 PM
To: Sajid Ali Syed 
Cc: Barry Smith ; petsc-users@mcs.anl.gov 

Subject: Re: [petsc-users] KSP_Solve crashes in debug mode

On Wed, Feb 22, 2023 at 6:18 PM Sajid Ali Syed via petsc-users 
mailto:petsc-users@mcs.anl.gov>> wrote:
One thing to note in relation to the trace attached in the previous email is 
that there are no warnings until the 36th call to KSP_Solve. The first error 
(as indicated by ASAN) occurs somewhere before the 40th call to KSP_Solve (part 
of what the application marks as turn 10 of the propagator). The crash finally 
occurs on the 43rd call to KSP_Solve.

Looking at the trace, it appears that stack handling is messed up and 
eventually it causes the crash. This can happen when
PetscFunctionBegin is not matched up with PetscFunctionReturn. Can you try 
running this with

  -checkstack

  Thanks,

 Matt

Thank You,
Sajid Ali (he/him) | Research Associate
Data Science, Simulation, and Learning Division
Fermi National Accelerator Laboratory
s-sajid-ali.github.io

________
From: Sajid Ali Syed mailto:sas...@fnal.gov>>
Sent: Wednesday, February 22, 2023 5:11 PM
To: Barry Smith mailto:bsm...@petsc.dev>>
Cc: petsc-users@mcs.anl.gov<mailto:petsc-users@mcs.anl.gov> 
mailto:petsc-users@mcs.anl.gov>>
Subject: Re: [petsc-users] KSP_Solve crashes in debug mode

Hi Barry,

Thanks a lot for fixing this issue. I ran the same problem on a linux machine 
and have the following trace for the same crash (with ASAN turned on for both 
PETSc (on the latest commit of the branch) and the application) : 
https://gist.github.com/s-sajid-ali/85bdf689eb8452ef8702c214c4df6940

The trace seems to indicate a couple of buffer overflows, one of which causes 
the crash. I'm not sure as to what causes them.

Thank You,
Sajid Ali (he/him) | Research Associate
Data Science, Simulation, and Learning Division
Fermi National Accelerator Laboratory
s-sajid-ali.github.io


From: Barry Smith mailto:bsm...@petsc.dev>>
Sent: Wednesday, February 15, 2023 2:01 PM
To: Sajid Ali Syed mailto:sas...@fnal.gov>>
Cc: petsc-users@mcs.anl.gov<mailto:petsc-users@mcs.anl.gov> 
mailto:petsc-users@mcs.anl.gov>>
Subject: Re: [petsc-users] KSP_Solve crashes in debug mode


https://gitlab.com/petsc/petsc/-/merge_requests/6075
 should fix the possible recursive error condition Matt pointed out


On Feb 9, 2023, at 6:24 PM, Matthew Knepley 
mailto:knep...@gmail.com>> wrote:

On Thu, Feb 9, 2023 at 6:05 PM Sajid Ali Syed via petsc-users 
mailto:petsc-users@mcs.anl.gov>> wrote:

I added “-malloc_debug” in a .petscrc file and ran it again. The backtrace from 
lldb is in the attached file. The crash now seems to be at:

Process 32660 stopped
* thread #1, queue = 'com.apple.main-thread', stop reason = EXC_BAD_ACCESS (code=2, address=0x16f603fb8)
    frame #0: 0x000112ecc8f8 libpetsc.3.018.dylib`PetscFPrintf(comm=0, fd=0x, format=0x) at mprint.c:601
   598      `PetscViewerASCIISynchronizedPrintf()`, `PetscSynchronizedFlush()`
   599  @*/
   600  PetscErrorCode PetscFPrintf(MPI_Comm comm, FILE *fd, const char format[], ...)
-> 601  {
   602    PetscMPIInt rank;
   603
   604    PetscFunctionBegin;
(lldb) frame info
frame #0: 0x000112ecc8f8 libpetsc.3.018.dylib`PetscFPrintf(comm=0, fd=0x, format=0x) at mprint.c:601

Re: [petsc-users] KSP_Solve crashes in debug mode

2023-02-22 Thread Sajid Ali Syed via petsc-users
One thing to note in relation to the trace attached in the previous email is 
that there are no warnings until the 36th call to KSP_Solve. The first error 
(as indicated by ASAN) occurs somewhere before the 40th call to KSP_Solve (part 
of what the application marks as turn 10 of the propagator). The crash finally 
occurs on the 43rd call to KSP_Solve.


Thank You,
Sajid Ali (he/him) | Research Associate
Data Science, Simulation, and Learning Division
Fermi National Accelerator Laboratory
s-sajid-ali.github.io<http://s-sajid-ali.github.io>


From: Sajid Ali Syed 
Sent: Wednesday, February 22, 2023 5:11 PM
To: Barry Smith 
Cc: petsc-users@mcs.anl.gov 
Subject: Re: [petsc-users] KSP_Solve crashes in debug mode

Hi Barry,

Thanks a lot for fixing this issue. I ran the same problem on a linux machine 
and have the following trace for the same crash (with ASAN turned on for both 
PETSc (on the latest commit of the branch) and the application) : 
https://gist.github.com/s-sajid-ali/85bdf689eb8452ef8702c214c4df6940

The trace seems to indicate a couple of buffer overflows, one of which causes 
the crash. I'm not sure as to what causes them.

Thank You,
Sajid Ali (he/him) | Research Associate
Data Science, Simulation, and Learning Division
Fermi National Accelerator Laboratory
s-sajid-ali.github.io<http://s-sajid-ali.github.io>


From: Barry Smith 
Sent: Wednesday, February 15, 2023 2:01 PM
To: Sajid Ali Syed 
Cc: petsc-users@mcs.anl.gov 
Subject: Re: [petsc-users] KSP_Solve crashes in debug mode


https://gitlab.com/petsc/petsc/-/merge_requests/6075
 should fix the possible recursive error condition Matt pointed out


On Feb 9, 2023, at 6:24 PM, Matthew Knepley  wrote:

On Thu, Feb 9, 2023 at 6:05 PM Sajid Ali Syed via petsc-users 
mailto:petsc-users@mcs.anl.gov>> wrote:

I added “-malloc_debug” in a .petscrc file and ran it again. The backtrace from 
lldb is in the attached file. The crash now seems to be at:

Process 32660 stopped
* thread #1, queue = 'com.apple.main-thread', stop reason = EXC_BAD_ACCESS (code=2, address=0x16f603fb8)
    frame #0: 0x000112ecc8f8 libpetsc.3.018.dylib`PetscFPrintf(comm=0, fd=0x, format=0x) at mprint.c:601
   598      `PetscViewerASCIISynchronizedPrintf()`, `PetscSynchronizedFlush()`
   599  @*/
   600  PetscErrorCode PetscFPrintf(MPI_Comm comm, FILE *fd, const char format[], ...)
-> 601  {
   602    PetscMPIInt rank;
   603
   604    PetscFunctionBegin;
(lldb) frame info
frame #0: 0x000112ecc8f8 libpetsc.3.018.dylib`PetscFPrintf(comm=0, fd=0x, format=0x) at mprint.c:601
(lldb)


The trace seems to indicate some sort of infinite loop causing an overflow.

Yes, I have also seen this. What happens is that we have a memory error. The 
error is reported inside PetscMallocValidate()
using PetscErrorPrintf, which eventually calls PetscCallMPI, which calls 
PetscMallocValidate again, which fails. We need to
remove all error checking from the prints inside Validate.
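
The recursion can be illustrated with a tiny self-contained example (plain C, not
the actual PETSc code; the function names are made up): if the error reporter
itself re-runs the check that just failed, the two functions recurse until the
stack overflows.

  #include <stdio.h>

  static void validate_heap(void);

  static void report_error(const char *msg)
  {
    validate_heap();              /* re-runs the failing check */
    fprintf(stderr, "%s\n", msg);
  }

  static void validate_heap(void)
  {
    int corrupted = 1;            /* pretend corruption was detected */
    if (corrupted) report_error("memory corruption detected");
  }

  int main(void)
  {
    validate_heap();              /* never returns: unbounded recursion */
    return 0;
  }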

  Thanks,

 Matt


PS: I'm using an arm64 Mac, so I don't have access to valgrind.

Thank You,
Sajid Ali (he/him) | Research Associate
Scientific Computing Division
Fermi National Accelerator Laboratory
s-sajid-ali.github.io


--
What most experimenters take for granted before they begin their experiments is 
infinitely more interesting than any results to which their experiments lead.
-- Norbert Wiener

https://www.cse.buffalo.edu/~knepley/



Re: [petsc-users] KSP_Solve crashes in debug mode

2023-02-22 Thread Sajid Ali Syed via petsc-users
Hi Barry,

Thanks a lot for fixing this issue. I ran the same problem on a linux machine 
and have the following trace for the same crash (with ASAN turned on for both 
PETSc (on the latest commit of the branch) and the application) : 
https://gist.github.com/s-sajid-ali/85bdf689eb8452ef8702c214c4df6940

The trace seems to indicate a couple of buffer overflows, one of which causes 
the crash. I'm not sure as to what causes them.

Thank You,
Sajid Ali (he/him) | Research Associate
Data Science, Simulation, and Learning Division
Fermi National Accelerator Laboratory
s-sajid-ali.github.io<http://s-sajid-ali.github.io>


From: Barry Smith 
Sent: Wednesday, February 15, 2023 2:01 PM
To: Sajid Ali Syed 
Cc: petsc-users@mcs.anl.gov 
Subject: Re: [petsc-users] KSP_Solve crashes in debug mode


https://gitlab.com/petsc/petsc/-/merge_requests/6075
 should fix the possible recursive error condition Matt pointed out


On Feb 9, 2023, at 6:24 PM, Matthew Knepley  wrote:

On Thu, Feb 9, 2023 at 6:05 PM Sajid Ali Syed via petsc-users 
mailto:petsc-users@mcs.anl.gov>> wrote:

I added “-malloc_debug” in a .petscrc file and ran it again. The backtrace from 
lldb is in the attached file. The crash now seems to be at:

Process 32660 stopped
* thread #1, queue = 'com.apple.main-thread', stop reason = EXC_BAD_ACCESS (code=2, address=0x16f603fb8)
    frame #0: 0x000112ecc8f8 libpetsc.3.018.dylib`PetscFPrintf(comm=0, fd=0x, format=0x) at mprint.c:601
   598      `PetscViewerASCIISynchronizedPrintf()`, `PetscSynchronizedFlush()`
   599  @*/
   600  PetscErrorCode PetscFPrintf(MPI_Comm comm, FILE *fd, const char format[], ...)
-> 601  {
   602    PetscMPIInt rank;
   603
   604    PetscFunctionBegin;
(lldb) frame info
frame #0: 0x000112ecc8f8 libpetsc.3.018.dylib`PetscFPrintf(comm=0, fd=0x, format=0x) at mprint.c:601
(lldb)


The trace seems to indicate some sort of infinite loop causing an overflow.

Yes, I have also seen this. What happens is that we have a memory error. The 
error is reported inside PetscMallocValidate()
using PetscErrorPrintf, which eventually calls PetscCallMPI, which calls 
PetscMallocValidate again, which fails. We need to
remove all error checking from the prints inside Validate.

  Thanks,

 Matt


PS: I'm using an arm64 Mac, so I don't have access to valgrind.

Thank You,
Sajid Ali (he/him) | Research Associate
Scientific Computing Division
Fermi National Accelerator Laboratory
s-sajid-ali.github.io


--
What most experimenters take for granted before they begin their experiments is 
infinitely more interesting than any results to which their experiments lead.
-- Norbert Wiener

https://www.cse.buffalo.edu/~knepley/



Re: [petsc-users] KSP_Solve crashes in debug mode

2023-02-09 Thread Sajid Ali Syed via petsc-users
I’ve also printed out the head struct in the debugger, and it looks like this:

(lldb) print (TRSPACE)*head
(TRSPACE) $7 = {
  size = 16
  rsize = 16
  id = 12063
  lineno = 217
  filename = 0x0001167fd865 
"/Users/sasyed/Documents/packages/petsc/src/sys/dll/reg.c"
  functionname = 0x0001167fde78 "PetscFunctionListDLAllPop_Private"
  classid = -253701943
  stack = {
function = {
  [0] = 0x00010189e2da "apply_bunch"
  [1] = 0x00010189e2da "apply_bunch"
  [2] = 0x00010189e2da "apply_bunch"
  [3] = 0x00010189e2da "apply_bunch"
  [4] = 0x00010189e2da "apply_bunch"
  [5] = 0x00010189e2da "apply_bunch"
  [6] = 0x00010189e2da "apply_bunch"
  [7] = 0x00010189e2da "apply_bunch"
  [8] = 0x00010189e2da "apply_bunch"
  [9] = 0x00010189e2da "apply_bunch"
  [10] = 0x00010189e2da "apply_bunch"
  [11] = 0x00010189e2da "apply_bunch"
  [12] = 0x00010189e2da "apply_bunch"
  [13] = 0x00010189e2da "apply_bunch"
  [14] = 0x00010189e2da "apply_bunch"
  [15] = 0x00010189e2da "apply_bunch"
  [16] = 0x00010189e2da "apply_bunch"
  [17] = 0x00010189e2da "apply_bunch"
  [18] = 0x00010189e2da "apply_bunch"
  [19] = 0x00010189e2da "apply_bunch"
  [20] = 0x00010189e2da "apply_bunch"
  [21] = 0x00010189e2da "apply_bunch"
  [22] = 0x00010189e2da "apply_bunch"
  [23] = 0x00010189e2da "apply_bunch"
  [24] = 0x00010189e2da "apply_bunch"
  [25] = 0x00010189e2da "apply_bunch"
  [26] = 0x00010189e2da "apply_bunch"
  [27] = 0x00010189e2da "apply_bunch"
  [28] = 0x00010189e2da "apply_bunch"
  [29] = 0x00010189e2da "apply_bunch"
  [30] = 0x00010189e2da "apply_bunch"
  [31] = 0x00010189e2da "apply_bunch"
  [32] = 0x00010189e2da "apply_bunch"
  [33] = 0x00010189e2da "apply_bunch"
  [34] = 0x00010189e2da "apply_bunch"
  [35] = 0x00010189e2da "apply_bunch"
  [36] = 0x00010189e2da "apply_bunch"
  [37] = 0x00010189e2da "apply_bunch"
  [38] = 0x00010189e2da "apply_bunch"
  [39] = 0x00010189e2da "apply_bunch"
  [40] = 0x00010189e2da "apply_bunch"
  [41] = 0x00010189e2da "apply_bunch"
  [42] = 0x00010189e2da "apply_bunch"
  [43] = 0x00010189e2da "apply_bunch"
  [44] = 0x00010189e2da "apply_bunch"
  [45] = 0x00010189e2da "apply_bunch"
  [46] = 0x00010189ebba "compute_mat"
  [47] = 0x00010189f0c3 "solve"
  [48] = 0x0001168b834c "KSPSolve"
  [49] = 0x0001168b89f7 "KSPSolve_Private"
  [50] = 0x0001168b395b "KSPSolve_GMRES"
  [51] = 0x0001168b37f8 "KSPGMRESCycle"
  [52] = 0x0001168ae4a7 "KSP_PCApplyBAorAB"
  [53] = 0x000116891b38 "PCApplyBAorAB"
  [54] = 0x0001168917ec "PCApply"
  [55] = 0x0001168a5337 "PCApply_MG"
  [56] = 0x0001168a5342 "PCApply_MG_Internal"
  [57] = 0x0001168a42e1 "PCMGMCycle_Private"
  [58] = 0x0001168b834c "KSPSolve"
  [59] = 0x0001168b89f7 "KSPSolve_Private"
  [60] = 0x00011682e396 "VecDestroy"
  [61] = 0x00011682d58e "VecDestroy_Seq"
  [62] = 0x0001168093fe "PetscObjectComposeFunction_Private"
  [63] = 0x000116809338 "PetscObjectComposeFunction_Petsc"
}
file = {
  [0] = 0x00010189e27f 
"/Users/sasyed/Documents/packages/synergia2/src/synergia/collective/space_charge_3d_fd.cc"
  [1] = 0x00010189e27f 
"/Users/sasyed/Documents/packages/synergia2/src/synergia/collective/space_charge_3d_fd.cc"
  [2] = 0x00010189e27f 
"/Users/sasyed/Documents/packages/synergia2/src/synergia/collective/space_charge_3d_fd.cc"
  [3] = 0x00010189e27f 
"/Users/sasyed/Documents/packages/synergia2/src/synergia/collective/space_charge_3d_fd.cc"
  [4] = 0x00010189e27f 
"/Users/sasyed/Documents/packages/synergia2/src/synergia/collective/space_charge_3d_fd.cc"
  [5] = 0x00010189e27f 
"/Users/sasyed/Documents/packages/synergia2/src/synergia/collective/space_charge_3d_fd.cc"
  [6] = 0x00010189e27f 
"/Users/sasyed/Documents/packages/synergia2/src/synergia/collective/space_charge_3d_fd.cc"
  [7] = 0x00010189e27f 
"/Users/sasyed/Documents/packages/synergia2/src/synergia/collective/space_charge_3d_fd.cc"
  [8] = 0x00010189e27f 
"/Users/sasyed/Documents/packages/synergia2/src/synergia/collective/space_charge_3d_fd.cc"
  [9] = 0x00010189e27f 
"/Users/sasyed/Documents/packages/synergia2/src/synergia/collective/space_charge_3d_fd.cc"
  [10] = 0x00010189e27f 
"/Users/sasyed/Documents/packages/synergia2/src/synergia/collective/space_charge_3d_fd.cc"
  [11] = 0x00010189e27f 
"/Users/sasyed/Documents/packages/synergia2/src/synergia/collective/space_charge_3d_fd.cc"
  [12] = 0x00010189e27f 

Re: [petsc-users] KSP_Solve crashes in debug mode

2023-02-09 Thread Sajid Ali Syed via petsc-users
Hi Barry,

The lack of line numbers is due to the fact that this build of PETSc was done 
via spack which installs it in a temporary directory before moving it to the 
final location.

I have removed that build and installed PETSc locally (albeit with a simpler 
configuration) and see the same bug. Logs for this configuration and the error 
trace with this build are attached with this email.

Thank You,
Sajid Ali (he/him) | Research Associate
Scientific Computing Division
Fermi National Accelerator Laboratory
s-sajid-ali.github.io<http://s-sajid-ali.github.io>


From: Barry Smith 
Sent: Thursday, February 9, 2023 12:02 PM
To: Sajid Ali Syed 
Cc: petsc-users@mcs.anl.gov 
Subject: Re: [petsc-users] KSP_Solve crashes in debug mode


  Hmm, looks like your build may be funny? It is not in debug mode

frame #2: 0x00010eda20c8 libpetsc.3.018.dylib`PetscHeaderDestroy_Private + 
1436
frame #3: 0x00010f10176c libpetsc.3.018.dylib`VecDestroy + 808
frame #4: 0x000110199f34 libpetsc.3.018.dylib`KSPSolve_Private + 512

In debug mode it would show the line numbers where the crash occurred and help 
us determine the problem. I do note that -g is being used by the compilers, so I 
cannot explain offhand why it does not display the debug information.

  Barry


On Feb 9, 2023, at 12:42 PM, Sajid Ali Syed via petsc-users 
 wrote:


Hi PETSc-developers,

In our application we call KSP_Solve as part of a step to propagate a beam 
through a lattice. I am observing a crash within KSP_Solve only after the 43rd 
call to it when building both the application and PETSc in debug mode; full logs 
are attached with this email (1 MPI rank and 4 OMP threads were used, but the 
crash occurs with multiple MPI ranks as well). I am also including the last few 
lines of the configuration for this build. This crash does not occur when 
building the application and PETSc in release mode.

Could someone tell me what causes this crash and if anything can be done to 
prevent it? Thanks in advance.

The configuration of this solver is here :  
https://github.com/fnalacceleratormodeling/synergia2/blob/sajid/features/openpmd_basic_integration/src/synergia/collective/space_charge_3d_fd_utils.cc#L273-L292

Thank You,
Sajid Ali (he/him) | Research Associate
Scientific Computing Division
Fermi National Accelerator Laboratory
s-sajid-ali.github.io





configure_log_tail_local_install
Description: configure_log_tail_local_install


ksp_crash_log_local_install
Description: ksp_crash_log_local_install


Re: [petsc-users] KSP_Solve crashes in debug mode

2023-02-09 Thread Sajid Ali Syed via petsc-users
The configuration log is attached with this email.





configure_log_tail
Description: configure_log_tail


[petsc-users] KSP_Solve crashes in debug mode

2023-02-09 Thread Sajid Ali Syed via petsc-users
Hi PETSc-developers,

In our application we call KSP_Solve as part of a step to propagate a beam 
through a lattice. I am observing a crash within KSP_Solve only after the 43rd 
call to it when building both the application and PETSc in debug mode; full logs 
are attached with this email (1 MPI rank and 4 OMP threads were used, but the 
crash occurs with multiple MPI ranks as well). I am also including the last few 
lines of the configuration for this build. This crash does not occur when 
building the application and PETSc in release mode.

Could someone tell me what causes this crash and if anything can be done to 
prevent it? Thanks in advance.

The configuration of this solver is here : 
https://github.com/fnalacceleratormodeling/synergia2/blob/sajid/features/openpmd_basic_integration/src/synergia/collective/space_charge_3d_fd_utils.cc#L273-L292

Thank You,
Sajid Ali (he/him) | Research Associate
Scientific Computing Division
Fermi National Accelerator Laboratory
s-sajid-ali.github.io<http://s-sajid-ali.github.io>

​


ksp_crash_log
Description: ksp_crash_log


[petsc-users] Regarding the status of MatSolve on GPUs

2022-10-06 Thread Sajid Ali
Hi PETSc-developers,

Does PETSc currently provide (either natively or via third-party packages) a
MatSolve that can be performed entirely on a GPU given a factored matrix,
i.e. a direct solver that stores the factors L and U on the device and uses
the GPU to solve the linear system? It does not matter if the GPU is not used
for the factorization, as we intend to solve the same linear system for
hundreds of iterations and thus want to avoid GPU->CPU transfers in the
MatSolve phase.

Currently, I've built PETSc@main (commit 9c433d, 10/03) with
superlu-dist@develop, both of which are configured with CUDA. With this,
I'm seeing that each call to PCApply/MatSolve involves one GPU->CPU
transfer. Is it possible to avoid this?
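
For reference, the kind of runtime configuration being discussed is roughly the
following (option names as documented in the PETSc manual pages; whether the
triangular solves then stay on the device is exactly the question being asked):

  -ksp_type preonly -pc_type lu \
  -pc_factor_mat_solver_type superlu_dist \
  -mat_type aijcusparse -vec_type cuda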

Thank You,
Sajid Ali (he/him) | Research Associate
Scientific Computing Division
Fermi National Accelerator Laboratory
s-sajid-ali.github.io


Re: [petsc-users] PetscLogView produces nan's instead of timing data when using GPUs

2022-10-06 Thread Sajid Ali
Hi Barry,

Thanks for the explanation.

On Wed, Oct 5, 2022 at 4:11 PM Barry Smith  wrote:

>
>   It prints Nan to indicate that the time for that event is not known
> accurately. But the times for the larger events that contain these events
> are known. So for example the time for KSPSolve is known but not the time
> for VecNorm.  The other numbers in the events, like number of times called
> etc that are not Nan are correct as displayed.
>
>   This is done because correctly tracking the times of the individual
> events requires synchronizations that slow down the entire calculation a
> bit; for example the time for the KSPSolve will register a longer time than
> it registers if the smaller events are not timed.
>
>   To display the times of the smaller events use -log_view_gpu_time also
> but note this will increase the times of the larger events a bit.
>
>   Barry
>
>
> On Oct 5, 2022, at 4:47 PM, Sajid Ali 
> wrote:
>
> Hi PETSc-developers,
>
> I'm having trouble with getting performance logs from an application that
> uses PETSc. There are no issues when I run it on a CPU, but every time a
> GPU is used there is no timing data and almost all times are replaced by
> times that are just `nan` (on two different clusters). I am attaching the
> log files for both cases with this email. Could someone explain what is
> happening here ?
>
> In case it helps, here are the routines used to initialize/finalize the
> application that also handle initializing/finalizing PETSc and printing the
> PETSc performance logs to PETSC_VIEWER_STDOUT_WORLD :
> https://github.com/fnalacceleratormodeling/synergia2/blob/devel3/src/synergia/utils/utils.h
>
> Thank You,
> Sajid Ali (he/him) | Research Associate
> Scientific Computing Division
> Fermi National Accelerator Laboratory
> s-sajid-ali.github.io
> 
>
>
>

-- 
Thank You,
Sajid Ali (he/him) | Research Associate
Scientific Computing Division
Fermi National Accelerator Laboratory
s-sajid-ali.github.io


[petsc-users] PetscLogView produces nan's instead of timing data when using GPUs

2022-10-05 Thread Sajid Ali
Hi PETSc-developers,

I'm having trouble with getting performance logs from an application that
uses PETSc. There are no issues when I run it on a CPU, but every time a
GPU is used there is no timing data and almost all times are replaced by
times that are just `nan` (on two different clusters). I am attaching the
log files for both cases with this email. Could someone explain what is
happening here ?

In case it helps, here are the routines used to initialize/finalize the
application that also handle initializing/finalizing PETSc and printing the
PETSc performance logs to PETSC_VIEWER_STDOUT_WORLD :
https://github.com/fnalacceleratormodeling/synergia2/blob/devel3/src/synergia/utils/utils.h

Thank You,
Sajid Ali (he/him) | Research Associate
Scientific Computing Division
Fermi National Accelerator Laboratory
s-sajid-ali.github.io


log-gpu
Description: Binary data


log-cpu
Description: Binary data


Re: [petsc-users] Regarding the status of VecSetValues(Blocked) for GPU vectors

2022-03-18 Thread Sajid Ali Syed
Hi Matt/Mark,

I'm working on a Poisson solver for a distributed PIC code, where the particles 
are distributed over MPI ranks rather than the grid. Prior to the solve, all 
particles are deposited onto a (DMDA) grid.

The current prototype I have is that each rank holds a full-size DMDA vector 
and particles on that rank are deposited into it. Then, the data from all the 
local vectors is combined into multiple distributed DMDA vectors via 
VecScatters, and this is followed by solving the Poisson equation. The need to 
have multiple subcomms, each solving the same equation, is due to the fact that 
the grid size is too small to use all the MPI ranks (beyond the strong scaling 
limit). The solution is then scattered back to each MPI rank via VecScatters.

This first local-to-(multi)global transfer required the use of multiple 
VecScatters, as there is no one-to-multiple scatter capability in SF. This works 
and is already giving a large speedup over the allreduce baseline currently in 
use (which transfers more data than necessary).

I was wondering if, within each subcommunicator, I could directly write to the 
DMDA vector via VecSetValues and PETSc would take care of stashing the values on 
the GPU until I call VecAssemblyBegin. Since this would be done from within a 
Kokkos parallel_for operation, there would be multiple (probably ~1e3) 
simultaneous writes that the stashing mechanism would have to support. 
Currently, we use Kokkos ScatterView to do this.

Thank You,
Sajid Ali (he/him) | Research Associate
Scientific Computing Division
Fermi National Accelerator Laboratory
s-sajid-ali.github.io<http://s-sajid-ali.github.io>


From: Matthew Knepley 
Sent: Thursday, March 17, 2022 7:25 PM
To: Mark Adams 
Cc: Sajid Ali Syed ; petsc-users@mcs.anl.gov 

Subject: Re: [petsc-users] Regarding the status of VecSetValues(Blocked) for 
GPU vectors

On Thu, Mar 17, 2022 at 8:19 PM Mark Adams 
mailto:mfad...@lbl.gov>> wrote:
LocalToGlobal is a DM thing..
Sajid, do use DM?
If you need to add off procesor entries then DM could give you a local vector 
as Matt said that you can add to for off procesor values and then you could use 
the CPU communication in DM.

It would be GPU communication, not CPU.

   Matt

On Thu, Mar 17, 2022 at 7:19 PM Matthew Knepley 
mailto:knep...@gmail.com>> wrote:
On Thu, Mar 17, 2022 at 4:46 PM Sajid Ali Syed 
mailto:sas...@fnal.gov>> wrote:
Hi PETSc-developers,

Is it possible to use VecSetValues with distributed-memory CUDA & Kokkos 
vectors from the device, i.e. can I call VecSetValues with GPU memory pointers 
and expect PETSc to figure out how to stash them on the device until I call 
VecAssemblyBegin (at which point PETSc could use GPU-aware MPI to populate 
off-process values)?

If this is not currently supported, is supporting this on the roadmap? Thanks 
in advance!

VecSetValues() will fall back to the CPU vector, so I do not think this will 
work on device.

Usually, our assembly computes all values and puts them in a "local" vector, 
which you can access explicitly as Mark said. Then
we call LocalToGlobal() to communicate the values, which does work directly on 
device using specialized code in VecScatter/PetscSF.
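
A minimal sketch (a fragment, not a full program) of that local-assembly-plus-
LocalToGlobal pattern, assuming a 2D DMDA `da` with one dof already exists; in
the GPU case the deposition loop would use device-side array accessors instead,
but the communication calls are the same:

  #include <petscdmda.h>

  Vec           gvec, lvec;
  PetscScalar **larr;

  PetscCall(DMCreateGlobalVector(da, &gvec));
  PetscCall(DMGetLocalVector(da, &lvec));
  PetscCall(VecSet(lvec, 0.0));
  PetscCall(DMDAVecGetArray(da, lvec, &larr));
  /* ... deposit particle contributions into larr[j][i], ghost cells included ... */
  PetscCall(DMDAVecRestoreArray(da, lvec, &larr));
  PetscCall(VecSet(gvec, 0.0));
  PetscCall(DMLocalToGlobalBegin(da, lvec, ADD_VALUES, gvec)); /* ghost contributions are added into the owning ranks */
  PetscCall(DMLocalToGlobalEnd(da, lvec, ADD_VALUES, gvec));
  PetscCall(DMRestoreLocalVector(da, &lvec));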

What are you trying to do?

  Thanks,

  Matt

Thank You,
Sajid Ali (he/him) | Research Associate
Scientific Computing Division
Fermi National Accelerator Laboratory
s-sajid-ali.github.io



--
What most experimenters take for granted before they begin their experiments is 
infinitely more interesting than any results to which their experiments lead.
-- Norbert Wiener

https://www.cse.buffalo.edu/~knepley/


--
What most experimenters take for granted before they begin their experiments is 
infinitely more interesting than any results to which their experiments lead.
-- Norbert Wiener

https://www.cse.buffalo.edu/~knepley/


[petsc-users] Regarding the status of VecSetValues(Blocked) for GPU vectors

2022-03-17 Thread Sajid Ali Syed
Hi PETSc-developers,

Is it possible to use VecSetValues with distributed-memory CUDA & Kokkos 
vectors from the device, i.e. can I call VecSetValues with GPU memory pointers 
and expect PETSc to figure out how to stash them on the device until I call 
VecAssemblyBegin (at which point PETSc could use GPU-aware MPI to populate 
off-process values)?

If this is not currently supported, is supporting this on the roadmap? Thanks 
in advance!

Thank You,
Sajid Ali (he/him) | Research Associate
Scientific Computing Division
Fermi National Accelerator Laboratory
s-sajid-ali.github.io<http://s-sajid-ali.github.io>



Re: [petsc-users] GAMG crash during setup when using multiple GPUs

2022-02-11 Thread Sajid Ali Syed
Hi Mark,

Thanks for the information.

@Junchao: Given that there are known issues with GPU aware MPI, it might be 
best to wait until there is an updated version of cray-mpich (which hopefully 
contains the relevant fixes).

Thank You,
Sajid Ali (he/him) | Research Associate
Scientific Computing Division
Fermi National Accelerator Laboratory
s-sajid-ali.github.io<http://s-sajid-ali.github.io>


From: Mark Adams 
Sent: Thursday, February 10, 2022 8:47 PM
To: Junchao Zhang 
Cc: Sajid Ali Syed ; petsc-users@mcs.anl.gov 

Subject: Re: [petsc-users] GAMG crash during setup when using multiple GPUs

Perlmutter has problems with GPU aware MPI.
This is being actively worked on at NERSc.

Mark

On Thu, Feb 10, 2022 at 9:22 PM Junchao Zhang 
mailto:junchao.zh...@gmail.com>> wrote:
Hi, Sajid Ali,
  I have no clue. I have access to perlmutter.  I am thinking how to debug that.
  If your app is open-sourced and easy to build, then I can build and debug it. 
Otherwise, suppose you build and install petsc (only with options needed by 
your app) to a shared directory, and I can access your executable (which uses 
RPATH for libraries), then maybe I can debug it (I only need to install my own 
petsc to the shared directory)

--Junchao Zhang


On Thu, Feb 10, 2022 at 6:04 PM Sajid Ali Syed 
mailto:sas...@fnal.gov>> wrote:
Hi Junchao,

With "-use_gpu_aware_mpi 0" there is no error. I'm attaching the log for this 
case with this email.

I also ran with gpu aware mpi to see if I could reproduce the error and got the 
error but from a different location. This logfile is also attached.

This was using the newest cray-mpich on NERSC-perlmutter (8.1.12). Let me know 
if I can share further information to help with debugging this.

Thank You,
Sajid Ali (he/him) | Research Associate
Scientific Computing Division
Fermi National Accelerator Laboratory
s-sajid-ali.github.io


From: Junchao Zhang mailto:junchao.zh...@gmail.com>>
Sent: Thursday, February 10, 2022 1:43 PM
To: Sajid Ali Syed mailto:sas...@fnal.gov>>
Cc: petsc-users@mcs.anl.gov<mailto:petsc-users@mcs.anl.gov> 
mailto:petsc-users@mcs.anl.gov>>
Subject: Re: [petsc-users] GAMG crash during setup when using multiple GPUs

Also, try "-use_gpu_aware_mpi 0" to see if there is a difference.

--Junchao Zhang


On Thu, Feb 10, 2022 at 1:40 PM Junchao Zhang 
mailto:junchao.zh...@gmail.com>> wrote:
Did it fail without GPU at 64 MPI ranks?

--Junchao Zhang


On Thu, Feb 10, 2022 at 1:22 PM Sajid Ali Syed 
mailto:sas...@fnal.gov>> wrote:

Hi PETSc-developers,

I’m seeing the following crash that occurs during the setup phase of the 
preconditioner when using multiple GPUs. The relevant error trace is shown 
below:

(GTL DEBUG: 26) cuIpcOpenMemHandle: resource already mapped, 
CUDA_ERROR_ALREADY_MAPPED, line no 272
[24]PETSC ERROR: - Error Message 
--
[24]PETSC ERROR: General MPI error
[24]PETSC ERROR: MPI error 1 Invalid buffer pointer
[24]PETSC ERROR: See 
https://petsc.org/release/faq/
 for trouble shooting.
[24]PETSC ERROR: Petsc Development GIT revision: 
f351d5494b5462f62c419e00645ac2e477b88cae  GIT Date: 2022-02-08 15:08:19 +
...
[24]PETSC ERROR: #1 PetscSFLinkWaitRequests_MPI() at 
/tmp/sajid/spack-stage/spack-stage-petsc-main-mnj56kbexro3fipf6kheyttljzwss7fo/spack-src/src/vec/is/sf/impls/basic/sfmpi.c:54
[24]PETSC ERROR: #2 PetscSFLinkFinishCommunication() at 
/tmp/sajid/spack-stage/spack-stage-petsc-main-mnj56kbexro3fipf6kheyttljzwss7fo/spack-src/include/../src/vec/is/sf/impls/basic/sfpack.h:274
[24]PETSC ERROR: #3 PetscSFBcastEnd_Basic() at 
/tmp/sajid/spack-stage/spack-stage-petsc-main-mnj56kbexro3fipf6kheyttljzwss7fo/spack-src/src/vec/is/sf/impls/basic/sfbasic.c:218
[24]PETSC ERROR: #4 PetscSFBcastEnd() at 
/tmp/sajid/spack-stage/spack-stage-petsc-main-mnj56kbexro3fipf6kheyttljzwss7fo/spack-src/src/vec/is/sf/interface/sf.c:1499
[24]PETSC ERROR: #5 VecScatterEnd_Internal() at 
/tmp/sajid/spack-stage/spack-stage-petsc-main-mnj56kbexro3fipf6kheyttljzwss7fo/spack-src/src/vec/is/sf/interface/vscat.c:87
[24]PETSC ERROR: #6 VecScatterEnd() at 
/tmp/sajid/spack-stage/spack-stage-petsc-main-mnj56kbexro3fipf6kheyttljzwss7fo/spack-src/src/vec/is/sf/interface/vscat.c:1366
[24]PETSC ERROR: #7 MatMult_MPIAIJCUSPARSE() at 
/tmp/sajid/spack-stage/spack-stage-petsc-main-mnj56kbexro3fipf6k

[petsc-users] Sparse solvers for distributed GPU matrices/vectors arising from 3D poisson eq

2022-02-04 Thread Sajid Ali Syed
Hi PETSc-developers,

Could the linear solver table (at 
https://petsc.org/main/overview/linear_solve_table/) be updated with 
information regarding direct solvers that work on mpiaijkokkos/kokkos (or 
mpiaijcusparse/cuda) matrix/vector types?

The use case for this solver is to repeatedly invert the same matrix, so any 
solver that can perform the SpTRSV phase entirely with GPU matrices/vectors 
would be helpful (even if the initial factorization is performed using CPU 
matrices/vectors with GPU offload); this would be the distributed-memory 
counterpart to the current device-solve capabilities of the seqaijkokkos matrix 
type (provided by the kokkos-kernels SpTRSV routines). The system arises from a 
7-point finite difference discretization of the 3D Poisson equation on a 
256x256x1024 mesh (which will likely necessitate multiple GPUs) with Dirichlet 
boundary conditions.
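
For context, a grid of that size would typically be described by a 3D DMDA along
these lines (a sketch using current PETSc error-checking macros; nothing here is
specific to the application):

  DM da;
  PetscCall(DMDACreate3d(PETSC_COMM_WORLD,
                         DM_BOUNDARY_NONE, DM_BOUNDARY_NONE, DM_BOUNDARY_NONE,
                         DMDA_STENCIL_STAR,               /* 7-point stencil */
                         256, 256, 1024,                  /* global mesh     */
                         PETSC_DECIDE, PETSC_DECIDE, PETSC_DECIDE,
                         1, 1, NULL, NULL, NULL, &da));
  PetscCall(DMSetUp(da));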

The recent article on PETScSF (arXiv:2102.13018) describes an asynchronous CG 
solver that works well on communication-bound multi-GPU systems. Is this solver 
available now, and can it be combined with GAMG/hypre preconditioning?


Thank You,
Sajid Ali (he/him) | Research Associate
Scientific Computing Division
Fermi National Accelerator Laboratory
s-sajid-ali.github.io<http://s-sajid-ali.github.io>



Re: [petsc-users] petsc-users Digest, Vol 149, Issue 47

2021-05-18 Thread Sajid Ali
You could use VecPermute and MatPermute from the PETSc API to permute the
vectors/matrices. However, MatPermute creates a new matrix and even though
VecPermute permutes the vector (locally) in-place, it allocates a temporary
array and frees the original array.

Since you are working with dense matrices and vectors and want to avoid
temporary allocations, you could use the BLAS level-1 swap function (used by
VecSwap to swap two vectors) which will probably be the most optimized
version for the hardware you're using (since it's implemented by platform
specific intrinsics and assembly).



--
>
> Message: 1
> Date: Tue, 18 May 2021 11:44:36 -0500
> From: Barry Smith 
> To: Roland Richter 
> Cc: PETSc 
> Subject: Re: [petsc-users] Efficient FFTShift-implementation for
> vectors/matrices
> Message-ID: <83012843-a364-4ce1-a92b-1768eb115...@petsc.dev>
> Content-Type: text/plain; charset="us-ascii"
>
>
>   I found a variety of things on the web, below. I don't understand this
> but for the even case it seems one simply modifies the input matrix before
> the FFT http://www.fftw.org/faq/section3.html#centerorigin
>
>
> https://stackoverflow.com/questions/5915125/fftshift-ifftshift-c-c-source-code
>
>   https://www.dsprelated.com/showthread/comp.dsp/20790-1.php
>
>
>
> > On May 18, 2021, at 9:48 AM, Roland Richter 
> wrote:
> >
> > Dear all,
> >
> > I tried to implement the function fftshift from numpy (i.e. swap the
> > half-spaces of all axis) for row vectors in a matrix by using the
> > following code
> >
> > void fft_shift(Mat &fft_matrix) {
> >     PetscScalar *mat_ptr;
> >     MatDenseGetArray(fft_matrix, &mat_ptr);
> >     PetscInt r_0, r_1;
> >     MatGetOwnershipRange(fft_matrix, &r_0, &r_1);
> >     PetscInt local_row_num = r_1 - r_0;
> >     arma::cx_mat temp_mat(local_row_num, Ntime, arma::fill::zeros);
> >     for(int i = 0; i < Ntime; ++i) {
> >         const PetscInt row_shift = i * local_row_num;
> >         for(int j = 0; j < local_row_num; ++j) {
> >             const PetscInt cur_pos = j + row_shift;
> >             if(i < (int)(Ntime / 2))
> >                 temp_mat(j, i + int(Ntime / 2)) = *(mat_ptr + cur_pos);
> >             else
> >                 temp_mat(j, i - int(Ntime / 2)) = *(mat_ptr + cur_pos);
> >         }
> >     }
> >     for(int i = 0; i < Ntime; ++i) {
> >         const PetscInt row_shift = i * local_row_num;
> >         for(int j = 0; j < local_row_num; ++j) {
> >             const PetscInt cur_pos = j + row_shift;
> >             *(mat_ptr + cur_pos) = temp_mat(j, i);
> >         }
> >     }
> >     MatDenseRestoreArray(fft_matrix, &mat_ptr);
> > }
> >
> > but I do not like the approach of having a second matrix as temporary
> > storage space. Are there more efficient approaches possible using
> > PETSc-functions?
> >
> > Thanks!
> >
> > Regards,
> >
> > Roland Richter
> >
>
>
>

-- 
Sajid Ali (he/him) | PhD Candidate
Applied Physics
Northwestern University
s-sajid-ali.github.io


Re: [petsc-users] Convert a 3D DMDA sub-vector to a natural 2D vector

2021-01-25 Thread Sajid Ali
Hi Randall,

Thanks for providing a pointer to the DMDAGetRay function!

After looking at its implementation, I came up with a solution that creates
a natural ordered slice vector on the same subset of processors as the DMDA
ordered slice vector (by scattering from the DMDA order slice to a natural
ordered slice by using an AO associated with a temporary 2D DMDA object
that lives only on the subset of ranks where the slice vector lives). I've
attached the code for the same should it be of interest to anyone who reads
this.

--
Sajid Ali (he/him) | PhD Candidate
Applied Physics
Northwestern University
s-sajid-ali.github.io


ex_slice_nat.c
Description: Binary data


[petsc-users] Convert a 3D DMDA sub-vector to a natural 2D vector

2021-01-23 Thread Sajid Ali
Hi PETSc-developers,

For an application, I'd like to extract a 2D slice from a 3D DMDA vector,
perform a `MatMult` on it (a discretized rotation on the 2D vector) and
place the resulting vector back into the 3D vector. The approach I'd taken
was to use `DMDACreatePatchIS` to create an IS that selects the slice (say
this slice is [:,:,n_z] with no loss of generality). This IS is then used
to extract the 2D slice via the `VecGetSubVector` utility. The issue that
arises from this scheme is that the extracted 2D vector does not represent
a flattened 2D vector; it is instead ordered as a DMDA vector (i.e., each
rank holds the portion of the vector it owns in column-major ordering). How
do I convert this DMDA-ordered 2D subvector to a vector that represents a
flattened 2D array? Since I can't predict how the extracted sub-vector is
laid out across ranks, I can't simply reorder it locally.

While PETSc provides `DMDAGlobalToNatural` routines, those don't apply to
extracted sub-vectors and extracting the 2D slice from a natural vector
(filled by a scatter from the global vector) does not work either. On
the other hand, converting the output of DMDACreatePatchIS to a natural IS
via `AOPetscToApplicationIS` before extracting a subvector extracts the
wrong indices from the 3D vector.

Could someone tell me how I can achieve the desired goal (converting a 2D
vector ordered as per a 3D DMDA grid onto a natural flattened index) ?
Thanks in advance for the help!

PS : Should it help, I'm attaching a slightly modified version of
`src/dm/tests/ex53.c` to demonstrate the issue. The array I wish to obtain
(the [:,:,2] slice) is: `[196., 194., 192., 190., 198., 197., 196., 195.,
200., 200., 200.,200., 202., 203., 204., 205.]` (a row-major flattening of
the 2D slice or alternatively a column-major flattening). What I obtain
from the program (by creating the PatchIS and extracting the subvector)
instead is : `[0 (A)] 196,198,194,197 [1 (B)] 200,202,200,203 [2 (C)]
192,196,190,195 [3 (D)] 200,204,200,205` (where each square bracket
represents the rank and portion of the 2D vector being flattened via a
column-major formatting with the layout shown below)
|---|---|
| A | C |
| B | D |
|---|---|


Thank You,
Sajid Ali (he/him) | PhD Candidate
Applied Physics
Northwestern University
s-sajid-ali.github.io


ex53.c
Description: Binary data


Re: [petsc-users] Regarding changes in the 3.14 release

2020-10-28 Thread Sajid Ali
Hi Matt,

Thanks for the clarification. The documentation
<https://gitlab.com/petsc/petsc/-/blob/master/src/snes/interface/snes.c#L3304>
for SNESSetLagPreconditioner states "If  -1 is used before the very first
nonlinear solve the preconditioner is still built because there is no
previous preconditioner to use" which was true prior to 3.14, is this
statement no longer valid ?

What is the difference between having -snes_lag_preconditioner -2 and
having -snes_lag_preconditioner_persists true ?

PS : For clarity, the man page for SNESSetLagJacobianPersists should perhaps
not list the lag-preconditioner options-database keys, and vice versa.

Thank You,
Sajid Ali | PhD Candidate
Applied Physics
Northwestern University
s-sajid-ali.github.io


[petsc-users] Regarding changes in the 3.14 release

2020-10-28 Thread Sajid Ali
Hi PETSc-developers,

I have a few questions regarding changes to PETSc between version 3.13.5
and current master. I’m trying to run an application that worked with no
issues with version 3.13.5 but isn’t working with the current master.

[1] To assemble a matrix in this application I loop over all rows and have
multiple calls to MatSetValuesStencil with INSERT_VALUES as the addv
argument for all except one call which has ADD_VALUES. Final assembly is
called after this loop. With PETSc-3.13.5 this ran with no errors but with
PETSc-master I get :

Object is in wrong state
[0]PETSC ERROR: Cannot mix add values and insert values

This is fixed by having a flush assembly in between two stages where the
first stage has two loops with INSERT_VALUES and the second stage has a
loop with ADD_VALUES.
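
For reference, a rough sketch of the flush-assembly fix described above (A
being the matrix in question; only the assembly calls are shown):

```c
/* stage 1: all the MatSetValuesStencil(..., INSERT_VALUES) loops */
MatAssemblyBegin(A, MAT_FLUSH_ASSEMBLY);   /* flush before switching modes */
MatAssemblyEnd(A, MAT_FLUSH_ASSEMBLY);
/* stage 2: the MatSetValuesStencil(..., ADD_VALUES) loop */
MatAssemblyBegin(A, MAT_FINAL_ASSEMBLY);
MatAssemblyEnd(A, MAT_FINAL_ASSEMBLY);
```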

Did this change result from a bugfix or are users now expected to no longer
mix add and insert values within the same loop ?

[2] To prevent re-building the preconditioner at all TSSteps, I had the
command line argument -snes_lag_preconditioner -1. This did the job in
3.13.5 but with the current master I get the following error :

Cannot set the lag to -1 from the command line since the
preconditioner must be built as least once, perhaps you mean -2

I can however run the application without this option. If this is a
breaking change, what is the new option to prevent re-building the
preconditioner ?

[3] Finally, I’m used the latest development version of MPICH for building
both 3.13.5 and petsc-master and I get these warnings at exit :

[WARNING] yaksa: 2 leaked handles
 (repeated N number of times where N is number of mpi ranks)

Can this be safely neglected ?

Let me know if sharing either the application code and/or logs would be
helpful and I can share either.

Thank You,
Sajid Ali | PhD Candidate
Applied Physics
Northwestern University
s-sajid-ali.github.io


Re: [petsc-users] Question about MatGetRowMax

2020-09-08 Thread Sajid Ali
Hi Hong,

A related bugfix is that lines 2444 and 2447 from
src/mat/impls/aij/mpi/mpiaij.c in the current petsc-master are missing a
check for validity of idx. Adding a check ( if (idx)  ...) before accessing
the entries of idx might be necessary (since the docs say that the idx
argument is optional).

Thanks for the insight into the cause of this bug.

-- 
Sajid Ali | PhD Candidate
Applied Physics
Northwestern University
s-sajid-ali.github.io


[petsc-users] Question about MatGetRowMax

2020-09-08 Thread Sajid Ali
Hi PETSc-developers,

While trying to use MatGetRowMax, I’m getting the following error :

[0]PETSC ERROR:

[0]PETSC ERROR: Caught signal number 11 SEGV: Segmentation Violation,
probably memory access out of range
[0]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger
[0]PETSC ERROR: or see
https://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind
[0]PETSC ERROR: or try http://valgrind.org on GNU/linux and Apple Mac
OS X to find memory corruption errors
[0]PETSC ERROR: likely location of problem given in stack below
[0]PETSC ERROR: -  Stack Frames

[0]PETSC ERROR: Note: The EXACT line numbers in the stack are not available,
[0]PETSC ERROR:   INSTEAD the line number of the start of the function
[0]PETSC ERROR:   is given.
[0]PETSC ERROR: [0] MatGetRowMax_SeqAIJ line 3182
/home/sajid/packages/petsc/src/mat/impls/aij/seq/aij.c
[0]PETSC ERROR: [0] MatGetRowMax line 4798
/home/sajid/packages/petsc/src/mat/interface/matrix.c
[0]PETSC ERROR: [0] MatGetRowMax_MPIAIJ line 2432
/home/sajid/packages/petsc/src/mat/impls/aij/mpi/mpiaij.c
[0]PETSC ERROR: [0] MatGetRowMax line 4798
/home/sajid/packages/petsc/src/mat/interface/matrix.c
[0]PETSC ERROR: [0] construct_matrix line 25
/home/sajid/Documents/intern/pirt/src/matrix.cxx
[0]PETSC ERROR: - Error Message
--
[0]PETSC ERROR: Signal received
[0]PETSC ERROR: See
https://www.mcs.anl.gov/petsc/documentation/faq.html for trouble
shooting.
[0]PETSC ERROR: Petsc Development GIT revision:
v3.13.5-2756-g0264f47704  GIT Date: 2020-09-06 15:08:48 -0500
[0]PETSC ERROR: /home/sajid/Documents/intern/pirt/src/pirt on a
arch-linux-c-debug named xrm-backup by sajid Tue Sep  8 13:56:53 2020
[0]PETSC ERROR: Configure options --with-hdf5=1 --with-debugging=yes
[0]PETSC ERROR: #1 User provided function() line 0 in  unknown file
application called MPI_Abort(MPI_COMM_WORLD, 50161059) - process 0

Could someone point out what could cause such a segfault ?

PS : Should it be useful, the error occurs in the (nested) call to
MatGetRowMax for the off-diagonal SeqAIJ matrix and the segmentation
violation occurs for the first row of a matrix whose ncols=0 (for the off
diagonal part).

MatGetOwnershipRangeColumn was used to set the diagonal and off-diagonal
preallocation and all the columns were set to be in the diagonal SeqAIJ
matrix as shown below :

(gdb) frame
#0  MatGetRowMax (mat=0x7ed340, v=0x982470, idx=0x0) at
/home/sajid/packages/petsc/src/mat/interface/matrix.c:4803
4803  if (!mat->ops->getrowmax)
SETERRQ1(PetscObjectComm((PetscObject)mat),PETSC_ERR_SUP,"Mat type
%s",((PetscObject)mat)->type_name);
(gdb) print mat->cmap->rstart
$12 = 0
(gdb) print mat->cmap->rend
$13 = 65536
(gdb) step
4804  MatCheckPreallocated(mat,1);
(gdb) next
4806  ierr = (*mat->ops->getrowmax)(mat,v,idx);CHKERRQ(ierr);
(gdb) next

Program received signal SIGSEGV, Segmentation fault.
0x760ca248 in MatGetRowMax_SeqAIJ (A=0x8d1450, v=0x9b1220,
idx=0x99a620) at
/home/sajid/packages/petsc/src/mat/impls/aij/seq/aij.c:3195
3195  x[i] = *aa; if (idx) idx[i] = 0;
(gdb)

Thank You,
Sajid Ali | PhD Candidate
Applied Physics
Northwestern University
s-sajid-ali.github.io


[petsc-users] Questions about HDF5 parallel I/O

2020-09-03 Thread Sajid Ali
Hi PETSc-developers,

Currently, VecLoad reads an entire dataset from an HDF5 file into a PETSc
vector (with any number of MPI ranks). Given that there is no routine to
load a subset of an HDF5 dataset into a PETSc vector, the next best thing
is to load the entire dataset into memory and select a smaller region as a
sub-vector. Is there an example that demonstrates this? (Mainly to get an
idea of how to select a 2D array from a 3D array using a PETSc IS. Given
that it's a regular 3D vector, is it best to use a DMDA 3D vec, which gives
ownership ranges that may aid with creating the IS?)
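
For what it's worth, here is a minimal sketch of the "load everything, then
take a sub-vector" idea for a contiguous global index range [lo, hi). The file
and dataset names are made up, error checking is omitted, and a general
2D-slice-of-3D selection would instead need an IS built from the 3D index map
(e.g. via DMDACreatePatchIS):

```c
Vec         full, sub;
IS          is;
PetscViewer viewer;
PetscInt    rstart, rend, first, last;
PetscInt    lo, hi;   /* global bounds of the desired sub-range (assumed given) */

VecCreate(PETSC_COMM_WORLD, &full);
PetscObjectSetName((PetscObject)full, "dataset");   /* must match the HDF5 dataset name */
PetscViewerHDF5Open(PETSC_COMM_WORLD, "data.h5", FILE_MODE_READ, &viewer);
VecLoad(full, viewer);                               /* sizes are taken from the file    */
PetscViewerDestroy(&viewer);

VecGetOwnershipRange(full, &rstart, &rend);
first = PetscMax(lo, rstart);                        /* local piece of [lo, hi)          */
last  = PetscMin(hi, rend);
ISCreateStride(PETSC_COMM_WORLD, PetscMax(last - first, 0), first, 1, &is);
VecGetSubVector(full, is, &sub);
/* ... work with sub ... */
VecRestoreSubVector(full, is, &sub);
ISDestroy(&is);
```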

I've seen on earlier threads that XDMF can be used to create a map of where
data is present in HDF5 files. Is there an example of doing this with
regular vectors to select subvectors as described above?

Also, is it possible to have different sub-comms read different hdf5 groups
?

Thank You,
Sajid Ali | PhD Candidate
Applied Physics
Northwestern University
s-sajid-ali.github.io


Re: [petsc-users] Question on usage of PetscMalloc(Re)SetCUDAHost

2020-08-25 Thread Sajid Ali
Hi Barry,

Thanks for the explanation! Removing the calls to
PetscMalloc(Re)SetCUDAHost solved that issue.

Just to clarify: do all PetscMallocs happen on the host, with no special
PetscMalloc for device memory allocation? (Say, for an operation sequence
PetscMalloc1(N, &arr) followed by VecCUDAGetArray(cudavec, &d_arr).)
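
To make the question concrete, a sketch of the sequence being asked about
(names like `cudavec`, `host_vals`, `d_arr` and the size N are placeholders;
it assumes `cudavec` is a VECCUDA vector and, per the question, that
PetscMalloc1 hands back host memory; error checking omitted):

```c
Vec          cudavec;                   /* assumed: a VECCUDA vector created elsewhere */
PetscInt     N;                         /* assumed: buffer length                      */
PetscScalar *host_vals;                 /* host buffer from PetscMalloc1               */
PetscScalar *d_arr;                     /* device pointer from the CUDA vector         */

PetscMalloc1(N, &host_vals);            /* host allocation                             */
/* ... fill host_vals, e.g. for later MatSetValues calls ... */
VecCUDAGetArray(cudavec, &d_arr);       /* device pointer, usable in CUDA kernels      */
/* ... launch kernels that read/write d_arr ... */
VecCUDARestoreArray(cudavec, &d_arr);
PetscFree(host_vals);
```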

Thank You,
Sajid Ali | PhD Candidate
Applied Physics
Northwestern University
s-sajid-ali.github.io


[petsc-users] Question on usage of PetscMalloc(Re)SetCUDAHost

2020-08-25 Thread Sajid Ali
Hi PETSc-developers,

Is it valid to allocate matrix values on the host for later use on a GPU by
embedding all the allocation logic (i.e., the code block that calls
PetscMalloc1 for values and indices and sets them using MatSetValues) within
a section marked by PetscMalloc(Re)SetCUDAHost?

My understanding was that PetscMallocSetCUDAHost would force mallocs to be on
the host, but I'm getting the error shown below (for some strange reason, it
happens at the 5th column of the 0th row, both when setting one value at a
time and when setting the whole 0th row together):

[sajid@xrmlite cuda]$ mpirun -np 1 ~/packages/pirt/src/pirt -inputfile
shepplogan.h5
PIRT -- Parallel Iterative Reconstruction Tomography
Reading in real data from shepplogan.h5
After loading data, nTau:100, nTheta:50
After detector geometry context initialization
Initialized PIRT
[0]PETSC ERROR: - Error Message
--
[0]PETSC ERROR: Error in external library
[0]PETSC ERROR: cuda error 1 (cudaErrorInvalidValue) : invalid argument
[0]PETSC ERROR: See
https://www.mcs.anl.gov/petsc/documentation/faq.html for trouble
shooting.
[0]PETSC ERROR: Petsc Development GIT revision:
v3.13.2-947-gc2372adeb2  GIT Date: 2020-08-25 21:07:25 +
[0]PETSC ERROR: /home/sajid/packages/pirt/src/pirt on a
arch-linux-c-debug named xrmlite by sajid Tue Aug 25 18:30:55 2020
[0]PETSC ERROR: Configure options --with-hdf5=1 --with-cuda=1
[0]PETSC ERROR: #1 PetscCUDAHostFree() line 14 in
/home/sajid/packages/petsc/src/sys/memory/cuda/mcudahost.cu
[0]PETSC ERROR: #2 PetscFreeA() line 475 in
/home/sajid/packages/petsc/src/sys/memory/mal.c
[0]PETSC ERROR: #3 MatSeqXAIJFreeAIJ() line 135 in
/home/sajid/packages/petsc/include/../src/mat/impls/aij/seq/aij.h
[0]PETSC ERROR: #4 MatSetValues_SeqAIJ() line 498 in
/home/sajid/packages/petsc/src/mat/impls/aij/seq/aij.c
[0]PETSC ERROR: #5 MatSetValues() line 1392 in
/home/sajid/packages/petsc/src/mat/interface/matrix.c
[0]PETSC ERROR: #6 setMatrixElements() line 248 in
/home/sajid/packages/pirt/src/geom.cxx
[0]PETSC ERROR: #7 construct_matrix() line 91 in
/home/sajid/packages/pirt/src/matrix.cu
[0]PETSC ERROR: #8 main() line 20 in /home/sajid/packages/pirt/src/pirt.cxx
[0]PETSC ERROR: PETSc Option Table entries:
[0]PETSC ERROR: -inputfile shepplogan.h5
[0]PETSC ERROR: End of Error Message ---send
entire error message to petsc-ma...@mcs.anl.gov--
--
MPI_ABORT was invoked on rank 0 in communicator MPI_COMM_SELF
with errorcode 20076.

NOTE: invoking MPI_ABORT causes Open MPI to kill all MPI processes.
You may or may not see output from other processes, depending on
exactly when Open MPI kills them.
--
[sajid@xrmlite cuda]$

PetscCUDAHostFree is called within the PetscMalloc(Re)SetCUDAHost block
described earlier, which should have created valid memory on the host.

Could someone explain if this is the correct approach to take and what the
above error means ?

(PS : I’ve run ksp tutorial-ex2 with -vec_type cuda -mat_type aijcusparse
to test the installation and everything works as expected.)

Thank You,
Sajid Ali | PhD Candidate
Applied Physics
Northwestern University
s-sajid-ali.github.io


Re: [petsc-users] Question on matrix assembly

2020-08-22 Thread Sajid Ali
 Hi Barry,

Thanks for creating the new function. I'm somewhat confused as to how I'd
use it. Given an MPIAIJ matrix, is one supposed to extract the local SeqAIJ
matrix and set the preallocation on each mpi-rank independently followed by
MatSetValues (on the MPIAIJ matrix) to fill the rows? Or, does one create
SeqAIJ matrices on each rank and then combine them into a parallel MPIAIJ
matrix say by using MatCreateMPIMatConcatenateSeqMat?

I tried the second approach but leaving the "number of local columns" for
MatCreateMPIMatConcatenateSeqMat as PETSC_DECIDE causes a crash (when
running with 1 mpi rank). Is this the correct approach to take and if yes
what does "number of local columns" mean when combining the seqaij matrices
?

Thank You,
Sajid Ali | PhD Candidate
Applied Physics
Northwestern University
s-sajid-ali.github.io


Re: [petsc-users] Question on matrix assembly

2020-08-14 Thread Sajid Ali
Hi Barry,

All entries for a row are available together, but there is no requirement
to compute them in order.

Thank You,
Sajid Ali | PhD Candidate
Applied Physics
Northwestern University
s-sajid-ali.github.io


Re: [petsc-users] Question on matrix assembly

2020-08-14 Thread Sajid Ali
@Matthew Knepley  : Thanks for the explanation on
preallocation.

>However, why not have a flag so that on the first pass you do not compute
entries, just the indices?

The matrix computes the projection of an image onto a detector so
generating this involves computing all possible ray-rectangle intersections
and computing the values only differs from computing the indices by a call
to calculate intersection lengths. The process to set up the geometry and
check for intersections is the same to generate indices and values.

So, in this case the tradeoff would be to either compute everything twice
and save on storage cost or compute everything once and use more memory
(essentially compute the matrix rows on each rank, preallocate and then set
the matrix values).

@Stefano Zampini  : Yes, I only need MatMult and
MatMultTranspose in the TAO objective/gradient evaluation but in the
current state it's cheaper to use a matrix instead of computing the
intersections for each objective/gradient evaluation. About ~70% of the
application time is spent in MatMult and MatMultTranspose so we're hoping
that this would benefit from running on GPU's.

Thanks for the pointer to MatShell, implementing a matrix free method is
something we might pursue in the future.

-- 
Sajid Ali | PhD Candidate
Applied Physics
Northwestern University
s-sajid-ali.github.io


Re: [petsc-users] VecLoad into a SubVector ?

2020-07-27 Thread Sajid Ali
Hi Barry/Matt,

The fix to this bug would be to disable replacearray op on a subvector. I
modified the source code for vecio.c forcing VecLoad_HDF5 to always perform
an array copy and the above test passes for both binary and hdf5 viewers in
serial and parallel.

I can open a PR that adds a line Z->ops->replacearray = NULL; at line 1286
in the rvector.c file if one of you can confirm that the above logic is
correct. The example attached in the last email could be used as a test for
the same if necessary.

Thank You,
Sajid Ali | PhD Candidate
Applied Physics
Northwestern University
s-sajid-ali.github.io


Re: [petsc-users] VecLoad into a SubVector ?

2020-07-27 Thread Sajid Ali
Hi Barry/Matt,

I now have a simpler test (attached with this email) for this bug which
does the following :

   - Create a vector of size 50, set it to 2 and save to disk.
   - Create a vector of size 100, set it to 1 and extract the last 50
   elements as a subvector.
   - Load the saved vector from disk into the subvector and restore the
   subvector.
   - Test for VecSum, it should be 150.
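
A minimal sketch of those steps (single rank, HDF5 viewer, error checking
omitted; file and object names are made up):

```c
Vec         small, big, sub;
IS          is;
PetscViewer viewer;
PetscScalar sum;

/* write a vector of size 50 filled with 2 */
VecCreate(PETSC_COMM_WORLD, &small);
VecSetSizes(small, PETSC_DECIDE, 50);
VecSetFromOptions(small);
VecSet(small, 2.0);
PetscObjectSetName((PetscObject)small, "testvec");
PetscViewerHDF5Open(PETSC_COMM_WORLD, "saved.h5", FILE_MODE_WRITE, &viewer);
VecView(small, viewer);
PetscViewerDestroy(&viewer);

/* vector of size 100 filled with 1; the last 50 entries form the subvector */
VecCreate(PETSC_COMM_WORLD, &big);
VecSetSizes(big, PETSC_DECIDE, 100);
VecSetFromOptions(big);
VecSet(big, 1.0);
ISCreateStride(PETSC_COMM_WORLD, 50, 50, 1, &is);
VecGetSubVector(big, is, &sub);
PetscObjectSetName((PetscObject)sub, "testvec");

/* load the saved data into the subvector and restore it */
PetscViewerHDF5Open(PETSC_COMM_WORLD, "saved.h5", FILE_MODE_READ, &viewer);
VecLoad(sub, viewer);
PetscViewerDestroy(&viewer);
VecRestoreSubVector(big, is, &sub);

VecSum(big, &sum);                      /* expected: 50*1 + 50*2 = 150 */
```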

With one mpi rank (implying no scatters were used in the creation of the
subvector), the above works as expected when using binary IO but if one
uses HDF5 for IO, VecSum outputs 100 showing that the subvector didn’t
restore correctly. Running the executable in gdb I see that for both cases
the VecRestoreSubVector reads the variable VecGetSubVectorSavedStateId as 4
with the boolean variable valid being false.

My guess regarding the origin of the error with HDF5-IO is the fact that
VecLoad_HDF5 uses a VecReplaceArray to load the data and this somehow
messes up the assumptions regarding SubVector data pointers upon creation
by using VecPlaceArray.

As Barry mentioned the check for validity of subvector data is faulty and
would need to be fixed and that should be able to transfer the subvector
data back to the parent vector regardless of how the subvector is modified.

Thank You,
Sajid Ali | PhD Candidate
Applied Physics
Northwestern University
s-sajid-ali.github.io


ex_subvecio.c
Description: Binary data


[petsc-users] VecLoad into a SubVector ?

2020-07-27 Thread Sajid Ali
Hi PETSc-developers,

When I load data (using VecLoad) into a subvector, the parent vector does
not seem to get the data after the subvector is restored. I tried doing a
VecSet to verify that the index set (used to select the subvector) is valid
and the values set by VecSet are transferred back to the parent vector.
Could anyone point out if I'm missing something when I try transferring the
data via a VecLoad from a subvector to the parent vector ?

I'm attaching the code for selecting the subvector and loading data along
with a test hdf5 file (filled with random values). I expect the output
`testload.h5` to be a vector of size 100 with the first 50 being 1 and the
rest being the input values.

Thank You,
Sajid Ali | PhD Candidate
Applied Physics
Northwestern University
s-sajid-ali.github.io


testinput.h5
Description: Binary data


ex2.c
Description: Binary data


Re: [petsc-users] Reuse hypre interpolations between successive KSPOnly solves associated with TSStep ?

2020-05-07 Thread Sajid Ali
Hi Mark,

As Victor explained on the Hypre mailing list, setting
ksp_reuse_preconditioner flag doesn't have the intended effect because SNES
still recomputes the preconditioner at each time step. Setting the flag for
-snes_lag_preconditioner to -1 prevents BoomerAMG from recomputing the
interpolations at each time step.

Thank You,
Sajid Ali | PhD Candidate
Applied Physics
Northwestern University
s-sajid-ali.github.io


Re: [petsc-users] Converting complex PDE to real for KNL performance ?

2020-04-15 Thread Sajid Ali
 Hi everyone,

As Hong pointed out the optimization variable and gradient are both complex
in my use case. Just to give some context, the TS solves the IVP with the
parameters representing the refractive indices of the object at a given
orientation (Ni orientations in total). The optimization problem to solve
is :
obtain F such that, for each θi ∈ (0, π), yθi = TS(Aθi ∗ F)
(where Aθi represents a sparse matrix that rotates the F vector by angle θi).

Thus, a naive implementation for the same would be :
for i ∈ (0, Ni):
- obtain the parameters for this orientation by MatMult(Aθi*F)
- obtain yθi = TS(Aθi ∗ F) and y'θi = TSAdjointSolve(Aθi ∗ F) (the cost
function being the L2 norm of the difference between yθi and the actual data)
- rotate the gradient back by MatMultTranspose(Aθi*F) and update F.

But in the future I'd prefer to bunch the Ni misfits (with bounds
and regularizers) together as a multi-objective cost function and let TAO
handle the parallelization (whereby TAO is initialized with
`mpi_comm_world` but each PDE evaluation happens in its own `sub-comm` and
TAO handles the synchronization for updates) and the order of estimations
(instead of naive sequential).

While I don't know the optimization theory behind it, the current practice
in the x-ray community is to model the forward solve using FFTs instead
and use algorithmic differentiation to obtain the gradients. My motivation
for exploring the use of PDE's is due to (a) Adjoint solves being faster
when compared to algorithmic differentiation (b) Multigrid solvers being
fast/optimal (c) PDE models being more accurate on downsampled data.

PS : @Alp : Could you share the slides/manuscript from the siam pp20
meeting that describes the new multi-objective minimization features in TAO
?


Thank You,
Sajid Ali | PhD Candidate
Applied Physics
Northwestern University
s-sajid-ali.github.io


Re: [petsc-users] Converting complex PDE to real for KNL performance ?

2020-04-14 Thread Sajid Ali
Hi Matthew,

The TAO manual states that (preface, page vi) "However, TAO is not
compatible with PETSc installations using complex data types." (The tao
examples all require !complex builds. When I tried to run them with a petsc
build with +complex the compiler complains of incompatible pointer types
and the example crashes at runtime)

Is there any plan to support TAO with complex scalars ?

I had planned to re-use the TS object in an optimization loop with the F
vector defined both as a parameter in TS and as the independent variable in
the outer TAO loop.


Thank You,
Sajid Ali | PhD Candidate
Applied Physics
Northwestern University
s-sajid-ali.github.io


Re: [petsc-users] Converting complex PDE to real for KNL performance ?

2020-04-14 Thread Sajid Ali
Hi Hong,

Apologies for creating unnecessary confusion by continuing the old thread
instead of creating a new one.

While I looked into converting the complex PDE formulation to a real valued
formulation in the past hoping for better performance, my concern now is
with TAO being incompatible with complex scalars. I would've preferred to
keep the complex PDE formulation as is (given that I spent some time tuning
it and it works well now) for cost function and gradient evaluation while
using TAO for the outer optimization loop.

Using TAO has the obvious benefit of defining a multi objective cost
function, parametrized as a fit to a series of measurements and a set of
regularizers while not having to explicitly worry about differentiating the
regularizer or have to think about implementing a good optimization scheme.
But if converting the complex formulation to a real formulation would
mean a loss of well conditioned forward solve (and increase in solving time
itself), I was wondering if it would be better to keep the complex PDE
formulation and write an optimization loop in PETSc while defining the
regularizer via a cost integrand.

Thank You,
Sajid Ali | PhD Candidate
Applied Physics
Northwestern University
s-sajid-ali.github.io


Re: [petsc-users] Converting complex PDE to real for KNL performance ?

2020-04-14 Thread Sajid Ali
Hi Jed/PETSc-developers,

My goal is to invert a set of these PDE's to obtain a series of parameters
F_t (with TSSolve and TSAdjoint for function/gradient computation). I was
planning to use TAO for setting up the inverse problem but given that TAO
doesn't support complex scalars, I'm re-thinking about converting this to a
real formulation.

In https://doi.org/10.1007/s00466-006-0047-8, Mark Adams explains that the
K1 formulation of Day/Heroux is well suited to block matrices in PETSc but
the PDE described there has an large SPD operator arising out of the
underlying elliptic PDE. ( Day/Heroux in their paper claim that K2/K3
formulations are problematic for Krylov solvers due to non ideal eigenvalue
spectra created by conversion to real formulation).

Now, the system I have is : u_t = A*(u_xx + u_yy) + F_t*u;  (A is purely
imaginary and F_t is complex, with abs(A/F) ~ 1e-16. The parabolic PDE is
converted to a series of TS solves each being elliptic). I implemented the
K1/K4 approaches by using DMDA to manage the 2-dof grid instead of setting
it up as one large vector of [real,imag] and those didn't converge well
either (at least with simple preconditioners).

Any pointers as to what I could do to make the real formulation well
conditioned ? Or should I not bother with this for now and implement a
first order gradient descent method in PETSc  (while approximating the
regularizer as a cost integrand) ?


Thank You,
Sajid Ali | PhD Candidate
Applied Physics
Northwestern University
s-sajid-ali.github.io

On Wed, Mar 27, 2019 at 9:36 PM Jed Brown  wrote:

> When you roll your own equivalent real formulation, PETSc has no way of
> knowing what conjugate transpose might mean, thus symmetry is lost.  I
> would suggest just using the AVX2 implementation for now and putting in
> a request (or contributing a patch) for AVX-512 complex optimizations.
>
> Sajid Ali via petsc-users  writes:
>
> >  Hi,
> >
> > I'm able to solve the following equation using complex numbers (with
> > ts_type cn and pc_type gamg) :
> >   u_t = A*u'' + F_t*u;
> > (where A = -1j/(2k) amd u'' refers to u_xx+u_yy implemented with the
> > familiar 5-point stencil)
> >
> > Now, I want to solve the same problem using real numbers. The equivalent
> > equations are:
> > u_t_real   =  1/(2k) * u''_imag + F_real*u_real   - F_imag*u_imag
> > u_t_imag = -1/(2k) * u''_real   + F_imag*u_real - F_real*u_imag
> >
> > Thus, if we now take our new u vector to have twice the length of the
> > problem we're solving, keeping the first half as real and the second half
> > as imaginary, we'd get a matrix that had matrices computing the laplacian
> > via the 5-point stencil in the top-right and bottom-left corners and a
> > diagonal [F_real+F_imag, F_real-F_imag] term.
> >
> > I tried doing this and the gamg preconditioner complains about an
> > unsymmetric matrix. If i use the default preconditioner, I get
> > DIVERGED_NONLINEAR_SOLVE.
> >
> > Is there a way to better organize the matrix ?
> >
> > PS: I'm trying to do this using only real numbers because I realized that
> > the optimized avx-512 kernels for KNL are not implemented for complex
> > numbers. Would that be implemented soon ?
> >
> > Thank You,
> > Sajid Ali
> > Applied Physics
> > Northwestern University
>


-- 
Sajid Ali | PhD Candidate
Applied Physics
Northwestern University
s-sajid-ali.github.io


Re: [petsc-users] GAMG parameters for ideal coarsening ratio

2020-03-17 Thread Sajid Ali
 Hi Mark/Jed,

The problem I'm solving is scalar Helmholtz in 2D (u_t = A*u_xx + A*u_yy +
F_t*u, with the familiar 5-point central difference as the derivative
approximation; I'm also attaching the result of -info | grep GAMG in case
that helps). My goal is to get weak and strong scaling results for the FD
solver (leading me to double-check all my parameters). I ran the sweep again
as Mark suggested, and it looks like my base params were close to optimal
(negative threshold and 10 levels of squaring, with gmres/jacobi smoothers;
chebyshev/sor is slower).

[image: image.png]

While I think that the base parameters should work well for strong scaling,
do I have to modify any of my parameters for a weak scaling run ? Does GAMG
automatically increase the number of mg-levels as grid size increases or is
it upon the user to do that ?

@Mark : Is there a GAMG implementation paper I should cite ? I've already
added a citation for the Comput. Mech. (2007) 39: 497–507 as a reference
for the general idea of applying agglomeration type multigrid
preconditioning to helmholtz operators.


Thank You,
Sajid Ali | PhD Candidate
Applied Physics
Northwestern University
s-sajid-ali.github.io


Re: [petsc-users] Questions about TSAdjoint for time dependent parameters

2020-03-12 Thread Sajid Ali
Hi Hong,

For the optimal control example, the cost function has an integral term
which necessitates the setup of a sub-TS quadrature. The Jacobian with
respect to the parameters (henceforth denoted by Jacp) has dimensions that
depend upon the number of steps that the TS integrates for.

I'm trying to implement a simpler case where the cost function doesn't have
an integral term but the parameters are still time dependent. For this, I
modified the standard Van der Pol example (ex20adj.c) to make mu a time
dependent parameter (though it has the same value at all points in time and
I also made the initial conditions & params independent).

Since the structure of Jacp doesn't depend on time (i.e. it is the same at
all points in time, the structure being identical to the time-independent
case), is it necessary that I create a Jacp matrix whose dimensions
are [dimensions of time-independent Jacp] * -ts_max_steps ? Keeping Jacp
dimensions the same as dimensions of time-independent Jacp causes the
program to crash (possibly due to the fact that Jacp and adjoint vector
can't be multiplied). Ideally, it would be nice to have a Jacp analog of
TSRHSJacobianSetReuse whereby I specify the Jacp routine once and TS knows
how to reuse that at all times. Is this possible with the current
petsc-master ?

Another question I have is regarding exclusive calculation of one adjoint.
If I'm not interested in adjoint with respect to initial conditions, can I
ask TSAdjoint to not calculate it? Setting the initialization for the
adjoint vector with respect to initial conditions to NULL in
TSSetCostGradients doesn't work.

Thank You,
Sajid Ali | PhD Candidate
Applied Physics
Northwestern University
s-sajid-ali.github.io


Re: [petsc-users] Questions about TSAdjoint for time dependent parameters

2020-02-25 Thread Sajid Ali
Hi Hong,

Thanks for the explanation!

If I have a cost function consisting of an L2 norm of the difference of a
TS-solution and some reference along with some constraints (say bounds,
L1-sparsity, total variation etc), would I provide a routine for gradient
evaluation of only the L2 norm (where TAO would take care of the
constraints) or do I also have to take the constraints into account (since
I'd also have to differentiate the regularizers) ?


Thank You,
Sajid Ali | PhD Candidate
Applied Physics
Northwestern University
s-sajid-ali.github.io


[petsc-users] Questions about TSAdjoint for time dependent parameters

2020-02-25 Thread Sajid Ali
Hi PETSc-developers,

Could the code used for section 5.1 of the recent paper "PETSc TSAdjoint: a
discrete adjoint ODE solver for first-order and second-order sensitivity
analysis" be shared ? Are there more examples that deal with time dependent
parameters in the git repository ?

Another question I have is regarding the equations used to introduce
adjoints in section 7.1 of the manual where for the state of the solution
vector is denoted by y and the parameters by p.

[1] I'm unsure about what the partial derivative of y0 with respect to p
means since I understand y0 to be the initial conditions used to solve the
TS which would not depend on the parameters (since the parameters are
related to the equations TS tries to solve for, which should not depend
on the initialization used). Could someone clarify what this means ?

[2] The manual describes that a user has to set the correct initialization
for the adjoint variables when calling TSSetCostGradients. The
initialization for the mu vector is given as dΦi/dp at t=tF. If p is
time dependent, does one evaluate this derivative with respect to p(t) at
t=tF ?

Thank You,
Sajid Ali | PhD Candidate
Applied Physics
Northwestern University
s-sajid-ali.github.io


[petsc-users] Required structure and attrs for MatLoad from hdf5

2020-02-04 Thread Sajid Ali
Hi PETSc-developers,

The example src/mat/examples/tutorials/ex10.c shows how one would read a
matrix from a hdf5 file. Since MatView isn’t implemented for hdf5_mat
format, how is the hdf5 file (to be used to run ex10) generated ?

I tried reading from an hdf5 file but I saw an error stating object 'jc'
doesn't exist and thus would like to know how I should store a sparse
matrix in an hdf5 file so that MatLoad works.

PS: I’m guessing that MATLAB stores the matrix in the format that PETSc
expects (group/dset/attrs) but I’m creating this from Python. If the
recommended approach is to transfer numpy arrays to PETSc matrices via
petsc4py, I’d switch to that instead of directly creating hdf5 files.

Thank You,
Sajid Ali | PhD Candidate
Applied Physics
Northwestern University
s-sajid-ali.github.io


Re: [petsc-users] VecDuplicate for FFTW-Vec causes VecDestroy to fail conditionally on VecLoad

2019-11-01 Thread Sajid Ali via petsc-users
 Hi Junchao/Barry,

It doesn't really matter what the h5 file contains,  so I'm attaching a
lightly edited script of src/vec/vec/examples/tutorials/ex10.c which should
produce a vector to be used as input for the above test case. (I'm working
with ` --with-scalar-type=complex`).

Now that I think of it, fixing this bug is not important, I can workaround
the issue by creating a new vector with VecCreateMPI and accept the small
loss in performance of VecPointwiseMult due to misaligned layouts. If it's
a small fix it may be worth the time, but fixing this is not a big priority
right now. If it's a complicated fix, this issue can serve as a note to
future users.


Thank You,
Sajid Ali
Applied Physics
Northwestern University
s-sajid-ali.github.io


ex10.c
Description: Binary data


Re: [petsc-users] VecView output to HDF5 in 3.12.0 broken ?

2019-10-16 Thread Sajid Ali via petsc-users
 Hi Barry,

Looking at the current code, am I right in assuming that the change is only
in naming conventions and not in logic? I'll make a MR soon.

Thank You,
Sajid Ali
Applied Physics
Northwestern University
s-sajid-ali.github.io


Re: [petsc-users] VecView output to HDF5 in 3.12.0 broken ?

2019-10-15 Thread Sajid Ali via petsc-users
Hi PETSc-developers,

I think I’ve found the commit that broke this. In MR-1706
<https://gitlab.com/petsc/petsc/merge_requests/1706?commit_id=84ccb19e065d3103c2f8e02df068ca6a03ec0e36>,
the definition of PETSC_HDF5_INT_MAX was changed from being set to
2147483647 to (~(hsize_t)0).

This new definition sets PETSC_HDF5_INT_MAX to 18446744073709551615 thereby
changing the thresholds in the chunking logic at
src/vec/vec/impls/mpi/pdvec.c (which causes the error I’m observing).

I’m not sure where the number 2147483647 comes from but I tried looking at
the older commits only to realize that include/petscviewerhdf5.h has always
had this number (ever since this definition was moved over from
include/petscviewer.h).

Snippet to check value of (~(hsize_t)0) :

(ipy3) [sajid@xrmlite misc]$ cat ex.c
#include "hdf5.h"

int main() {
printf("ref=%llu \n",(~(hsize_t)0));
size_t size = sizeof(hsize_t);
printf("size = %zu\n", size);
}
(ipy3) [sajid@xrmlite misc]$ h5cc ex.c
(ipy3) [sajid@xrmlite misc]$ ./a.out
ref=18446744073709551615
size = 8


Thank You,
Sajid Ali
Applied Physics
Northwestern University
s-sajid-ali.github.io


Re: [petsc-users] VecView output to HDF5 in 3.12.0 broken ?

2019-10-11 Thread Sajid Ali via petsc-users
Also, both versions of PETSc were built with ^hdf5@1.10.5 ^mpich@3.3
%gcc@8.3.0 so the error is most likely not from hdf5.


Re: [petsc-users] VecView output to HDF5 in 3.12.0 broken ?

2019-10-11 Thread Sajid Ali via petsc-users
Hi Stefano/PETSc Developers,

The chunksize is indeed limited to 4GB as per this page :
https://portal.hdfgroup.org/pages/viewpage.action?pageId=48808714.

With a (complex) DMDA vector of size (16384,16384,2) I see that PETSc saves
it as a hdf5 file with chunks of size (1024,1024,2). But with a non DMDA
vector I don't see any chunking happening. I tried examining the chunk size
after running the same example as above and increasing the size of the
vector until it fails to write.

The output of the following case (first size to fail) is attached : mpirun
-np 16 ./ex10 -m 134217728 &> log. There's a slightly different error here
which states :
```
minor: Some MPI function failed
  #018: H5Dchunk.c line 4706 in H5D__chunk_collective_fill(): Invalid
argument, error stack:
PMPI_Type_create_hvector(125): MPI_Type_create_hvector(count=0,
blocklength=-2147483648, stride=0, MPI_BYTE, newtype=0x7ffc177d5bf8) failed
PMPI_Type_create_hvector(80).: Invalid value for blocklen, must be
non-negative but is -2147483648
major: Internal error (too specific to document in detail)
```

Strangely the same case works with 3.11.1 and the dataset has 4 chunks. I'm
not sure how, but it looks like the chunking logic somehow got broken in
3.12.


Thank You,
Sajid Ali
Applied Physics
Northwestern University
s-sajid-ali.github.io


log
Description: Binary data


Re: [petsc-users] Question about TSComputeRHSJacobianConstant

2019-09-30 Thread Sajid Ali via petsc-users
Hi PETSc-developers,

Has this bug been fixed in the new 3.12 release ?

Thank You,
Sajid Ali
Applied Physics
Northwestern University
s-sajid-ali.github.io


Re: [petsc-users] MPI-FFTW example crashes

2019-06-02 Thread Sajid Ali via petsc-users
 @Barry : Perhaps config/BuildSystem/config/packages/fftw.py should use the
--host option when configure for PETSc is run with --with-batch=1.

If anyone here knows what --host must be set to for KNL, I'd appreciate it.

PS : I know that Intel MKL provides the FFTW API. If I wanted to try this,
is there a way to tell PETSc to pick the FFTW functions from MKL?


Thank You,
Sajid Ali
Applied Physics
Northwestern University


Re: [petsc-users] MPI-FFTW example crashes

2019-06-02 Thread Sajid Ali via petsc-users
 Hi Barry,

fftw-configure fails on the login node; I'm attaching the error message at
the bottom of this email. I tried requesting 1 hour of time on a compute
node to compile fftw there, but for some reason 1 hour is not enough to
compile fftw. Hence I was forced to use cray-fftw-3.3.8.1, which I have no
control over, though the manpage gives a lower limit for compiler/MPT/craype
versions which I'm not violating.


```
sajid@thetalogin5:~/packages/petsc> python complex_int_64_fftw_debug.py


===


 Configuring PETSc to compile on your system


===


===


  It appears you do not have valgrind installed on your system.


  We HIGHLY recommend you install it from www.valgrind.org


  Or install valgrind-devel or equivalent using your package manager.


  Then rerun ./configure


===


===


  Trying to download http://www.fftw.org/fftw-3.3.8.tar.gz for FFTW


===


===


  Running configure on FFTW; this may take several minutes


===





***


 UNABLE to CONFIGURE with GIVEN OPTIONS(see configure.log for
details):

---


Error running configure on FFTW: Could not execute "['./configure
--prefix=/gpfs/mira-home/sajid/packages/petsc/complex_int_64_fftw_debug
MAKE=/usr/bin/gmake --libdir=/gpfs/m
ira-home/sajid/packages/petsc/complex_int_64_fftw_debug/lib CC="cc"
CFLAGS="-fPIC -xMIC-AVX512 -O3"
AR="/opt/cray/pe/cce/8.7.6/binutils/x86_64/x86_64-pc-linux-gnu/bin/ar" ARF
LAGS="cr" LDFLAGS="-dynamic" CXX="CC" CXXFLAGS="-xMIC-AVX512 -O3 -fPIC"
F90="ftn" F90FLAGS="-fPIC -xMIC-AVX512 -O3" F77="ftn" FFLAGS="-fPIC
-xMIC-AVX512 -O3" FC="ftn" FCFLAGS
="-fPIC -xMIC-AVX512 -O3" --enable-shared MPICC="cc" --enable-mpi']":


checking for a BSD-compatible install... /usr/bin/install -c


checking whether build environment is sane... yes


checking for a thread-safe mkdir -p... /usr/bin/mkdir -p


checking for gawk... gawk


checking whether /usr/bin/gmake sets $(MAKE)... yes


checking whether /usr/bin/gmake supports nested variables... yes


checking whether to enable maintainer-specific portions of Makefiles... no


checking build system type... x86_64-pc-linux-gnu


checking host system type... x86_64-pc-linux-gnu


checking for gcc... cc


checking whether the C compiler works... yes


checking for C compiler default output file name... a.out


checking for suffix of executables...


checking whether we are cross compiling...configure: error: in
`/gpfs/mira-home/sajid/packages/petsc/complex_int_64_fftw_debug/externalpackages/fftw-3.3.8':

configure: error: cannot run C compiled programs.


If you meant to cross compile, use `--host'.


See `config.log' for more details


***
  ```





Thank You,
Sajid Ali
Applied Physics
Northwestern University


[petsc-users] MPI-FFTW example crashes

2019-06-02 Thread Sajid Ali via petsc-users
Hi PETSc-developers,

I'm trying to run ex143 on a cluster (alcf-theta). I compiled PETSc on
login node with cray-fftw-3.3.8.1 and there was no error in either
configure or make.

When I try running ex143 with 1 MPI rank on a compute node, everything works
fine, but with 2 MPI ranks it crashes with an illegal instruction, likely
due to memory corruption. I tried running it with valgrind, but the available
valgrind module on theta gives the error `valgrind: failed to start tool
'memcheck' for platform 'amd64-linux': No such file or directory`.

To get around this, I tried running it with gdb4hpc and I attached the
backtrace which shows that there is some error with mpi-fftw being called.
I also attach the output with -start_in_debugger command option.

What could possibly cause this error and how do I fix it ?

Thank You,
Sajid Ali
Applied Physics
Northwestern University
sajid@thetamom1:/gpfs/mira-home/sajid/sajid_proj/test_fftw> aprun -n 2 --cc 
depth -d 1 -j 1 -r 1 ./ex143 -start_in_debugger -log_view &> out
  
sajid@thetamom1:/gpfs/mira-home/sajid/sajid_proj/test_fftw> cat out 

  
PETSC: Attaching gdb to ./ex143 of pid 62260 on display :0.0 on machine 
nid03832
  
PETSC: Attaching gdb to ./ex143 of pid 62259 on display :0.0 on machine 
nid03832
  
xterm: xterm: Xt error: Can't open display: :0.0

  
Xt error: Can't open display: :0.0  

  
xterm: xterm: DISPLAY is not set

  
DISPLAY is not set  

  
Use PETSc-FFTW interface...1-DIM: 30

  
[1]PETSC ERROR: [0]PETSC ERROR: 

  


  
[0]PETSC ERROR: [1]PETSC ERROR: Caught signal number 4 Illegal instruction: 
Likely due to memory corruption 
  
Caught signal number 4 Illegal instruction: Likely due to memory corruption 

  
[0]PETSC ERROR: [1]PETSC ERROR: Try option -start_in_debugger or 
-on_error_attach_debugger   
 
Try option -start_in_debugger or -on_error_attach_debugger  

  
[0]PETSC ERROR: [1]PETSC ERROR: or see 
http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind
   
or see http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind 

  
[0]PETSC ERROR: [1]PETSC ERROR: or try http://valgrind.org on GNU/linux and 
Apple Mac OS X to find memory corruption errors 
  
or try http://valgrind.org on GNU/linux and Apple Mac OS X to find memory 
corruption errors   

[1]PETSC ERROR: [0]PETSC ERROR: likely location of problem given in stack below 

  
likely location of problem given in stack below 

  
[1]PETSC ERROR: [0]PETSC ERROR: -  Stack Frames 

  
-  Stack Frames 

  
[0]PETSC ERROR: [1]PETSC ERROR: Note: T

Re: [petsc-users] Question about TSComputeRHSJacobianConstant

2019-05-22 Thread Sajid Ali via petsc-users
Hi Matt,

Thanks for the explanation. That makes sense since I'd get reasonably close
convergence with preonly sometimes and not at other times which was
confusing.

Anyway, since there's no pc_tol (analogous to ksp_rtol/ksp_atol, etc.), I'd
have to set the gamg preconditioner options more carefully to ensure that
it converges in one run. But since there's no guarantee that what works for
one problem will work for another (or for the same problem at a different
grid size), I'll stick with GMRES+gamg for now.

Thank You,
Sajid Ali
Applied Physics
Northwestern University


Re: [petsc-users] Question about TSComputeRHSJacobianConstant

2019-05-22 Thread Sajid Ali via petsc-users
Hi Hong,

Looks like this is my fault since I'm using -ksp_type preonly -pc_type
gamg. If I use the default ksp (GMRES) then everything works fine on a
smaller problem.

Just to confirm,  -ksp_type preonly is to be used only with direct-solve
preconditioners like LU,Cholesky, right ?

Thank You,
Sajid Ali
Applied Physics
Northwestern University


Re: [petsc-users] Question about TSComputeRHSJacobianConstant

2019-05-17 Thread Sajid Ali via petsc-users
Hi Hong,

The solution has the right characteristics but it's off by many orders of
magnitude. It is ~3.5x faster than before.

Am I supposed to keep the TSRHSJacobianSetReuse function or not?


Thank You,
Sajid Ali
Applied Physics
Northwestern University


Re: [petsc-users] Question about TSComputeRHSJacobianConstant

2019-05-16 Thread Sajid Ali via petsc-users
While there is a ~3.5X speedup, deleting the aforementioned 20 lines also
leads the new version of petsc to give the wrong solution (off by orders of
magnitude for the same program).

I tried switching over the the IFunction/IJacobian interface as per the
manual (page 146) which the following lines :
```
TSSetProblemType(ts,TSLINEAR);
TSSetRHSFunction(ts,NULL,TSComputeRHSFunctionLinear,NULL);
TSSetRHSJacobian(ts,A,A,TSComputeRHSJacobianConstant,NULL);
```
are equivalent to :
```
TSSetProblemType(ts,TSLINEAR);
TSSetIFunction(ts,NULL,TSComputeIFunctionLinear,NULL);
TSSetIJacobian(ts,A,A,TSComputeIJacobianConstant,NULL);
```
But the example at src/ts/examples/tutorials/ex3.c employs a strategy of
setting a shift flag to prevent re-computation for time-independent
problems. Moreover, the docs say "using this function
(TSComputeIFunctionLinear) is NOT equivalent to using
TSComputeRHSFunctionLinear()" and now I'm even more confused.

PS : Doing the simple switch is as slow as the original code and the answer
is wrong as well.

Thank You,
Sajid Ali
Applied Physics
Northwestern University


Re: [petsc-users] Question about TSComputeRHSJacobianConstant

2019-05-16 Thread Sajid Ali via petsc-users
Hi Barry,

Thanks a lot for pointing this out. I'm seeing ~3X speedup in time !

Attached are the new log files. Does everything look right ?

Thank You,
Sajid Ali
Applied Physics
Northwestern University


out_50
Description: Binary data


out_100
Description: Binary data


[petsc-users] Question about TSComputeRHSJacobianConstant

2019-05-16 Thread Sajid Ali via petsc-users
Hi PETSc developers,

I have a question about TSComputeRHSJacobianConstant. If I create a TS (of
type linear) for a problem where the jacobian does not change with time
(set with the aforementioned option) and run it for different numbers of
time steps, why does the time it takes to evaluate the jacobian change (as
indicated by TSJacobianEval) ?

To clarify, I run the example with different TSSetTimeStep values, but the
same Jacobian matrix. I see that the time spent in KSPSolve increases with
increasing number of steps (which is as expected as this is a KSPOnly SNES
solver). But surprisingly, the time spent in TSJacobianEval also increases
with decreasing time-step (or increasing number of steps).

For reference, I attach the log files for two cases which were run with
different time steps and the source code.

Thank You,
Sajid Ali
Applied Physics
Northwestern University


ex_dmda.c
Description: Binary data


out_50
Description: Binary data


out_100
Description: Binary data


Re: [petsc-users] Quick question about ISCreateGeneral

2019-05-01 Thread Sajid Ali via petsc-users
Hi Barry,

I've written a simple program that does a scatter and reverses the order of
data between two vectors using locally generated index sets, and it works.
While I'd have expected that I would need to concatenate the index sets
before calling VecScatter, the program works without doing so (hopefully
making it more efficient). Does calling VecScatter on each rank with the
local index set take care of the necessary communication behind the scenes,
then?
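
For context, a sketch of the pattern in question (each rank builds only the
index sets for the entries it owns; x and y are assumed to be created
elsewhere with the same global size; error checking omitted):

```c
Vec        x, y;            /* assumed: same global size N, created elsewhere */
IS         is_from, is_to;
VecScatter scat;
PetscInt   rstart, rend, n, i, N, *to;

VecGetSize(x, &N);
VecGetOwnershipRange(x, &rstart, &rend);
n = rend - rstart;
PetscMalloc1(n, &to);
for (i = 0; i < n; i++) to[i] = N - 1 - (rstart + i);        /* reversed destination */
ISCreateStride(PETSC_COMM_WORLD, n, rstart, 1, &is_from);    /* locally owned source */
ISCreateGeneral(PETSC_COMM_WORLD, n, to, PETSC_OWN_POINTER, &is_to);

VecScatterCreate(x, is_from, y, is_to, &scat);               /* indices are global   */
VecScatterBegin(scat, x, y, INSERT_VALUES, SCATTER_FORWARD);
VecScatterEnd(scat, x, y, INSERT_VALUES, SCATTER_FORWARD);
VecScatterDestroy(&scat);
ISDestroy(&is_from);
ISDestroy(&is_to);
```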

Thank You,
Sajid Ali
Applied Physics
Northwestern University


ex_modify.c
Description: Binary data


[petsc-users] Quick question about ISCreateGeneral

2019-04-30 Thread Sajid Ali via petsc-users
Hi PETSc Developers,

I see that in the examples for ISCreateGeneral, the index sets are created
by copying values from int arrays (which were created by PetscMalloc1 which
is not collective).

If ISCreateGeneral is called with PETSC_COMM_WORLD and the int arrays
on each rank are independently created, does the index set created
concatenate all the int-arrays into one ? If not, what needs to be done to
get such an index set ?

PS: For context, I want to write a fftshift convenience function (like
numpy, MATLAB) but for large distributed vectors. I thought that I could do
this with VecScatter and two index sets, one shifted and one un-shifted.

Thank You,
Sajid Ali
Applied Physics
Northwestern University


Re: [petsc-users] Possible Bug ?

2019-04-22 Thread Sajid Ali via petsc-users
This can be tracked down to n vs N being used. The vector in the second
loop is created using N while MatCreateVecsFFTW uses n (for real numbers).
n!=N and hence the error.

If lines 50/51 and line 91 are switched to MatCreateVecsFFTW instead of
MatGetVecs and VecCreateSeq respectively, the example would work for real
numbers as well.
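
In other words, a sketch of the suggested change (with N the transform length;
the vectors then get whatever padded sizes the FFTW matrix requires for real
scalars):

```c
Mat      A;
Vec      x, y, z;                 /* input, forward output, backward output */
PetscInt dim[1] = {N};

MatCreateFFT(PETSC_COMM_WORLD, 1, dim, MATFFTW, &A);
MatCreateVecsFFTW(A, &x, &y, &z); /* instead of MatGetVecs/VecCreateSeq     */
```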


Re: [petsc-users] Possible Bug ?

2019-04-22 Thread Sajid Ali via petsc-users
Hi Barry,

I'm not sure why MatCreateVecsFFTW is not used at lines 50/51.

The error occurs at line 94 because in the second loop, the example
manually creates the x vector instead of the one created using the A
matrix. For complex numbers this is not an issue but for real numbers the
dimensions don't match. MatCreateVecsFFTW creates a vector of size 20 for
N=10 but VecCreateSeq is creating a vector of size 10. I'm not sure what
the rationale behind this test is.


Re: [petsc-users] Possible Bug ?

2019-04-22 Thread Sajid Ali via petsc-users
Apologies for the post. I didn't see that it was for complex vectors only.

On Mon, Apr 22, 2019 at 5:00 PM Sajid Ali 
wrote:

> Hi,
>
> I see that src/mat/examples/tests/ex112.c is failing for petsc@3.11.1
> configured without complex scalars. With complex scalars everything works
> fine.
>
> The error I see is :
> ```
> [sajid@xrmlite bugfix]$
> ./ex112
>
> No protocol
> specified
>
>
>
>  1-D: FFTW on vector of size
> 10
>
>   Error norm of |x - z|
> 5.37156
>
>   Error norm of |x - z|
> 5.90871
>
>   Error norm of |x - z|
> 5.96243
>
> [0]PETSC ERROR: - Error Message
> --
>
> [0]PETSC ERROR: Nonconforming object
> sizes
>
> [0]PETSC ERROR: Mat mat,Vec x: global dim 20
> 10
>
> [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html
> for trouble shooting.
> [0]PETSC ERROR: Petsc Release Version 3.11.1, Apr, 12,
> 2019
>
> [0]PETSC ERROR: ./ex112 on a  named xrmlite.phys.northwestern.edu by
> sajid Mon Apr 22 16:58:41
> 2019
> [0]PETSC ERROR: Configure options
> --prefix=/home/sajid/packages/spack/opt/spack/linux-centos7-x86_64/gcc-8.3.0/petsc-3.11.1-5bdbcozu3labtbbi7gtq4xa
> knay24lo6 --with-ssl=0 --download-c2html=0 --download-sowing=0
> --download-hwloc=0 CFLAGS="-O2 -march=native" FFLAGS="-O2 -march=native"
> CXXFLAGS="-
> O2 -march=native"
> --with-cc=/home/sajid/packages/spack/opt/spack/linux-centos7-x86_64/gcc-8.3.0/mpich-3.3-ig4cr2xw2x63bqs5rnmhfshln4iv7av5/bin/mpic
> c
> --with-cxx=/home/sajid/packages/spack/opt/spack/linux-centos7-x86_64/gcc-8.3.0/mpich-3.3-ig4cr2xw2x63bqs5rnmhfshln4iv7av5/bin/mpic++
> --with-fc=/h
> ome/sajid/packages/spack/opt/spack/linux-centos7-x86_64/gcc-8.3.0/mpich-3.3-ig4cr2xw2x63bqs5rnmhfshln4iv7av5/bin/mpif90
> --with-precision=double --w
> ith-scalar-type=real --with-shared-libraries=1 --with-debugging=1
> --with-64-bit-indices=0
> --with-blaslapack-lib="/home/sajid/packages/spack/opt/spa
>
> ck/linux-centos7-x86_64/gcc-8.3.0/intel-mkl-2019.3.199-kzcly5rtcjbkwtnm6tri6kkexnwoat5m/compilers_and_libraries_2019.3.199/linux/mkl/lib/intel64/li
> bmkl_intel_lp64.so
> /home/sajid/packages/spack/opt/spack/linux-centos7-x86_64/gcc-8.3.0/intel-mkl-2019.3.199-kzcly5rtcjbkwtnm6tri6kkexnwoat5m/compil
> ers_and_libraries_2019.3.199/linux/mkl/lib/intel64/libmkl_sequential.so
> /home/sajid/packages/spack/opt/spack/linux-centos7-x86_64/gcc-8.3.0/intel-m
> kl-2019.3.199-kzcly5rtcjbkwtnm6tri6kkexnwoat5m/compilers_and_libraries_2019.3.199/linux/mkl/lib/intel64/libmkl_core.so
> /lib64/libpthread.so /lib64/
> libm.so /lib64/libdl.so" --with-x=0 --with-clanguage=C --with-scalapack=0
> --with-metis=0 --with-hdf5=0 --with-hypre=0 --with-parmetis=0 --with-mump
> s=0 --with-trilinos=0 --with-fftw=1
> --with-fftw-dir=/home/sajid/packages/spack/opt/spack/linux-centos7-x86_64/gcc-8.3.0/fftw-3.3.8-mkj4ho2jp6xfrnkl
> mrvhdfh73woer2s7 --with-superlu_dist=0 --with-suitesparse=0
> --with-zlib-include=/home/sajid/packages/spack/opt/spack/linux-centos7-x86_64/gcc-8.3.0
> /zlib-1.2.11-jqrcjdjnrxvouufhjtxbfvfms23fsqpx/include
> --with-zlib-lib="-L/home/sajid/packages/spack/opt/spack/linux-centos7-x86_64/gcc-8.3.0/zlib-1
> .2.11-jqrcjdjnrxvouufhjtxbfvfms23fsqpx/lib -lz"
> --with-zlib=1
>
> [0]PETSC ERROR: #1 MatMult() line 2385 in
> /tmp/sajid/spack-stage/spack-stage-mTD0AV/petsc-3.11.1/src/mat/interface/matrix.c
>
> [0]PETSC ERROR: #2 main() line 94 in
> /home/sajid/packages/aux_xwp_petsc/bugfix/ex112.c
>
> [0]PETSC ERROR: No PETSc Option Table
> entries
>
> [0]PETSC ERROR: End of Error Message ---send entire
> error message to petsc-ma...@mcs.anl.gov--
>
> application called MPI_Abort(MPI_COMM_WORLD, 60) - process
> 0
>
> [unset]: write_line error; fd=-1 buf=:cmd=abort
> exitcode=60
>
> :
>
> system msg for write_line failure : Bad file
> descriptor
>
>
> ```
>
> I came across this because I saw that MatMult was failing for a new test
> related to a PR I was working on. Is this a bug ?
>
> Thank You,
> Sajid Ali
> Applied Physics
> Northwestern University
>


-- 
Sajid Ali
Applied Physics
Northwestern University


[petsc-users] Possible Bug ?

2019-04-22 Thread Sajid Ali via petsc-users
Hi,

I see that src/mat/examples/tests/ex112.c is failing for petsc@3.11.1
configured without complex scalars. With complex scalars everything works
fine.

The error I see is :
```
[sajid@xrmlite bugfix]$
./ex112

No protocol
specified



 1-D: FFTW on vector of size
10

  Error norm of |x - z|
5.37156

  Error norm of |x - z|
5.90871

  Error norm of |x - z|
5.96243

[0]PETSC ERROR: - Error Message
--

[0]PETSC ERROR: Nonconforming object
sizes

[0]PETSC ERROR: Mat mat,Vec x: global dim 20
10

[0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for
trouble shooting.
[0]PETSC ERROR: Petsc Release Version 3.11.1, Apr, 12,
2019

[0]PETSC ERROR: ./ex112 on a  named xrmlite.phys.northwestern.edu by sajid
Mon Apr 22 16:58:41 2019
[0]PETSC ERROR: Configure options
--prefix=/home/sajid/packages/spack/opt/spack/linux-centos7-x86_64/gcc-8.3.0/petsc-3.11.1-5bdbcozu3labtbbi7gtq4xa
knay24lo6 --with-ssl=0 --download-c2html=0 --download-sowing=0
--download-hwloc=0 CFLAGS="-O2 -march=native" FFLAGS="-O2 -march=native"
CXXFLAGS="-
O2 -march=native"
--with-cc=/home/sajid/packages/spack/opt/spack/linux-centos7-x86_64/gcc-8.3.0/mpich-3.3-ig4cr2xw2x63bqs5rnmhfshln4iv7av5/bin/mpic
c
--with-cxx=/home/sajid/packages/spack/opt/spack/linux-centos7-x86_64/gcc-8.3.0/mpich-3.3-ig4cr2xw2x63bqs5rnmhfshln4iv7av5/bin/mpic++
--with-fc=/h
ome/sajid/packages/spack/opt/spack/linux-centos7-x86_64/gcc-8.3.0/mpich-3.3-ig4cr2xw2x63bqs5rnmhfshln4iv7av5/bin/mpif90
--with-precision=double --w
ith-scalar-type=real --with-shared-libraries=1 --with-debugging=1
--with-64-bit-indices=0
--with-blaslapack-lib="/home/sajid/packages/spack/opt/spa
ck/linux-centos7-x86_64/gcc-8.3.0/intel-mkl-2019.3.199-kzcly5rtcjbkwtnm6tri6kkexnwoat5m/compilers_and_libraries_2019.3.199/linux/mkl/lib/intel64/li
bmkl_intel_lp64.so
/home/sajid/packages/spack/opt/spack/linux-centos7-x86_64/gcc-8.3.0/intel-mkl-2019.3.199-kzcly5rtcjbkwtnm6tri6kkexnwoat5m/compil
ers_and_libraries_2019.3.199/linux/mkl/lib/intel64/libmkl_sequential.so
/home/sajid/packages/spack/opt/spack/linux-centos7-x86_64/gcc-8.3.0/intel-m
kl-2019.3.199-kzcly5rtcjbkwtnm6tri6kkexnwoat5m/compilers_and_libraries_2019.3.199/linux/mkl/lib/intel64/libmkl_core.so
/lib64/libpthread.so /lib64/
libm.so /lib64/libdl.so" --with-x=0 --with-clanguage=C --with-scalapack=0
--with-metis=0 --with-hdf5=0 --with-hypre=0 --with-parmetis=0 --with-mump
s=0 --with-trilinos=0 --with-fftw=1
--with-fftw-dir=/home/sajid/packages/spack/opt/spack/linux-centos7-x86_64/gcc-8.3.0/fftw-3.3.8-mkj4ho2jp6xfrnkl
mrvhdfh73woer2s7 --with-superlu_dist=0 --with-suitesparse=0
--with-zlib-include=/home/sajid/packages/spack/opt/spack/linux-centos7-x86_64/gcc-8.3.0
/zlib-1.2.11-jqrcjdjnrxvouufhjtxbfvfms23fsqpx/include
--with-zlib-lib="-L/home/sajid/packages/spack/opt/spack/linux-centos7-x86_64/gcc-8.3.0/zlib-1
.2.11-jqrcjdjnrxvouufhjtxbfvfms23fsqpx/lib -lz"
--with-zlib=1

[0]PETSC ERROR: #1 MatMult() line 2385 in
/tmp/sajid/spack-stage/spack-stage-mTD0AV/petsc-3.11.1/src/mat/interface/matrix.c

[0]PETSC ERROR: #2 main() line 94 in
/home/sajid/packages/aux_xwp_petsc/bugfix/ex112.c

[0]PETSC ERROR: No PETSc Option Table
entries

[0]PETSC ERROR: End of Error Message ---send entire
error message to petsc-ma...@mcs.anl.gov--
application called MPI_Abort(MPI_COMM_WORLD, 60) - process
0

[unset]: write_line error; fd=-1 buf=:cmd=abort
exitcode=60

:

system msg for write_line failure : Bad file
descriptor


```

I came across this because I saw that MatMult was failing for a new test
related to a PR I was working on. Is this a bug ?

Thank You,
Sajid Ali
Applied Physics
Northwestern University


Re: [petsc-users] Error with VecDestroy_MPIFFTW+0x61

2019-04-17 Thread Sajid Ali via petsc-users
Hi Matt/Barry,

I've implemented this for 1D-complex-mpi vec and tested it.

Here is the modified source file ->
https://bitbucket.org/sajid__ali/petsc/src/86fb19b57a7c4f8f42644e5160d2afbdc5e03639/src/mat/impls/fft/fftw/fftw.c

Function definitions at
https://bitbucket.org/sajid__ali/petsc/src/86fb19b57a7c4f8f42644e5160d2afbdc5e03639/src/mat/impls/fft/fftw/fftw.c#lines-395

New op at
https://bitbucket.org/sajid__ali/petsc/src/86fb19b57a7c4f8f42644e5160d2afbdc5e03639/src/mat/impls/fft/fftw/fftw.c#lines-514

If this looks good, I can extend it to all cases (1/2/3 dims + real/complex)
and add a VecDuplicate/VecDestroy pair in the tests.
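
For the tests, a minimal sketch of the duplicate/destroy pair I have in mind
(a 1D complex FFTW matrix; the size and the choice of MatCreateFFT/MATFFTW
are just placeholders for whatever the existing test already sets up):

```
#include <petscmat.h>

int main(int argc, char **argv)
{
  Mat            A;
  Vec            x, y;
  PetscInt       dim[1] = {10};
  PetscErrorCode ierr;

  ierr = PetscInitialize(&argc, &argv, NULL, NULL); if (ierr) return ierr;
  ierr = MatCreateFFT(PETSC_COMM_WORLD, 1, dim, MATFFTW, &A);CHKERRQ(ierr);
  ierr = MatCreateVecsFFTW(A, &x, NULL, NULL);CHKERRQ(ierr);
  ierr = VecDuplicate(x, &y);CHKERRQ(ierr);   /* exercises the new duplicate op */
  ierr = VecDestroy(&y);CHKERRQ(ierr);        /* must go through the FFTW-aware destroy */
  ierr = VecDestroy(&x);CHKERRQ(ierr);
  ierr = MatDestroy(&A);CHKERRQ(ierr);
  ierr = PetscFinalize();
  return ierr;
}
```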


Thank You,
Sajid Ali
Applied Physics
Northwestern University


Re: [petsc-users] VecView to hdf5 broken for large (complex) vectors

2019-04-16 Thread Sajid Ali via petsc-users
>Perhaps if spack had an easier mechanism to allow the user to "point to"
local git clones it could get closer to the best of both worlds. Maybe
spack could support a list of local repositories and branches in the yaml
file.

I wonder if a local git clone of petsc can become a "mirror" for the petsc
spack package, though this is not the intended use of mirrors. Refer to
https://spack.readthedocs.io/en/latest/mirrors.html


Re: [petsc-users] VecView to hdf5 broken for large (complex) vectors

2019-04-16 Thread Sajid Ali via petsc-users
> develop > 3.11.99 > 3.10.xx > maint (or other strings)
Just discovered this issue when trying to build with my fork of spack at [1]:
https://github.com/s-sajid-ali/spack/commit/05e499571b428f37b8cd1c7d39013e3dec08e5c8


So, ideally each developer has to point their develop version at the branch
they want to build ? That would make communication a little confusing, since
spack's develop version is some package's master; now everyone wants a
different develop so that spack does not apply any patches meant for string
versions that sort lower than the lowest numeric version.

>Even if you change commit from 'abc' to 'def'spack won't recognize this
change and use the cached tarball.
True, but since the checksum changes and the user has to constantly zip and
unzip, I personally find git cloning easier to deal with, so it's just a
matter of preference.


Re: [petsc-users] VecView to hdf5 broken for large (complex) vectors

2019-04-16 Thread Sajid Ali via petsc-users
@Barry: Thanks for the bugfix!

@Satish: Thanks for pointing out this method!

My preferred way previously was to download the source code, unzip, edit,
and zip it again, then ask spack not to checksum (because my edit has changed
the contents) and build. Lately, spack has added git support, so now I create
a branch of spack where I add my bugfix branch as the default build git repo
instead of master, which avoids the checksum headaches.


Re: [petsc-users] VecView to hdf5 broken for large (complex) vectors

2019-04-16 Thread Sajid Ali via petsc-users
Quick question : To drop a print statement at the required location, I need
to modify the source code, build petsc from source, and compile against this
new version of petsc, right? Or is there an easier way? (Just to confirm
before putting in the effort.)

On Tue, Apr 16, 2019 at 8:42 PM Smith, Barry F.  wrote:

>
>   Dang, I  ranted too soon.
>
>   I built mpich  using spack (master branch) and a very old Gnu C compiler
> and it produced valgrind clean code. Spack definitely is not passing the
> --enable-g=meminit to MPICH ./configure so this version of MPICH valgrind
> must be clean by default? MPICH's ./configure has
>
> meminit  - Preinitialize memory associated structures and unions to
>eliminate access warnings from programs like valgrind
>
> The default for enable-g is most and
>
> most|yes)
> perform_memtracing=yes
> enable_append_g=yes
> perform_meminit=yes
> perform_dbgmutex=yes
> perform_mutexnesting=yes
> perform_handlealloc=yes
> perform_handle=yes
>
> So it appears that at least some releases of MPICH are suppose to be
> valgrind clean by default ;).
>
> Looking back at Sajid's valgrind output more carefully
>
> Conditional jump or move depends on uninitialised value(s)
> ==15359==at 0x1331069A: __intel_sse4_strncmp (in
> /opt/intel/compilers_and_libraries_2019.1.144/linux/compiler/lib/intel64_lin/libintlc.so.5)
>
> is the only valgrind error. Which I remember seeing from using Intel
> compilers for a long time, nothing to do with MPICH
>
> Thus I conclude that Sajid's code is actually valgrind clean; and I
> withdraw my rant about MPICH/spack
>
> Barry
>
>
>
> > On Apr 16, 2019, at 5:13 PM, Smith, Barry F.  wrote:
> >
> >
> >  So valgrind is printing all kinds of juicy information about
> uninitialized values but it is all worthless because MPICH was not built by
> spack to be valgrind clean. We can't know if any of the problems valgrind
> flags are real. MPICH needs to be configured with the option
> --enable-g=meminit to be valgrind clean. PETSc's --download-mpich always
> installs a valgrind clean MPI.
> >
> > It is unfortunate Spack doesn't provide a variant of MPICH that is
> valgrind clean; actually it should default to valgrind clean MPICH.
> >
> >  Barry
> >
> >
> >
> >
> >> On Apr 16, 2019, at 2:43 PM, Sajid Ali via petsc-users <
> petsc-users@mcs.anl.gov> wrote:
> >>
> >> So, I tried running the debug version with valgrind to see if I can
> find the chunk size that's being set but I don't see it. Is there a better
> way to do it ?
> >>
> >> `$ mpirun -np 32 valgrind ./ex_ms -prop_steps 1 -info &> out`. [The out
> file is attached.]
> >> 
> >
>
>

-- 
Sajid Ali
Applied Physics
Northwestern University


Re: [petsc-users] Error with VecDestroy_MPIFFTW+0x61

2019-04-16 Thread Sajid Ali via petsc-users
Hi Barry/Matt,

Since VecDuplicate calls v->ops->duplicate, can't we just add custom
duplicate ops to the (f_in/f_out/b_out) vectors when they are created via
MatCreateFFTW? (just like the custom destroy ops are defined)

Also, what is the PetscObjectStateIncrease function doing ?

Thank You,
Sajid Ali
Applied Physics
Northwestern University


Re: [petsc-users] VecView to hdf5 broken for large (complex) vectors

2019-04-16 Thread Sajid Ali via petsc-users
Hi Matt,

I tried running the same example with a smaller grid on a workstation and I
see that for a grid size of 8192x8192 (vector write dims 67108864, 2), the
output file has a chunk size of (16777215, 2).

I can’t see HDF5_INT_MAX in the spack build-log (which includes configure).
Is there a better way to look it up?

[sajid@xrmlite .spack]$ cat build.out | grep "HDF"
#define PETSC_HAVE_HDF5 1
#define PETSC_HAVE_LIBHDF5HL_FORTRAN 1
#define PETSC_HAVE_LIBHDF5 1
#define PETSC_HAVE_LIBHDF5_HL 1
#define PETSC_HAVE_LIBHDF5_FORTRAN 1
#define PETSC_HAVE_HDF5_RELEASE_VERSION 5
#define PETSC_HAVE_HDF5_MINOR_VERSION 10
#define PETSC_HAVE_HDF5_MAJOR_VERSION 1

Thank You,
Sajid Ali
Applied Physics
Northwestern University


[petsc-users] VecView to hdf5 broken for large (complex) vectors

2019-04-16 Thread Sajid Ali via petsc-users
Hi PETSc developers,

I’m trying to write a large vector created with VecCreateMPI (size
32768x32768) concurrently from 4 nodes (+32 tasks per node, total 128
mpi-ranks) and I see the following (indicative) error : [Full error log is
here : https://file.io/CdjUfe]

HDF5-DIAG: Error detected in HDF5 (1.10.5) MPI-process 52:
  #000: H5D.c line 145 in H5Dcreate2(): unable to create dataset
major: Dataset
minor: Unable to initialize object
  #001: H5Dint.c line 329 in H5D__create_named(): unable to create and
link to dataset
major: Dataset
minor: Unable to initialize object
  #002: H5L.c line 1557 in H5L_link_object(): unable to create new
link to object
major: Links
minor: Unable to initialize object
  #003: H5L.c line 1798 in H5L__create_real(): can't insert link
major: Links
minor: Unable to insert object
  #004: H5Gtraverse.c line 851 in H5G_traverse(): internal path traversal failed
major: Symbol table
HDF5-DIAG: Error detected in HDF5 (1.10.5) MPI-process 59:
  #000: H5D.c line 145 in H5Dcreate2(): unable to create dataset
major: Dataset
minor: Unable to initialize object
  #001: H5Dint.c line 329 in H5D__create_named(): unable to create and
link to dataset
major: Dataset
minor: Unable to initialize object
  #002: H5L.c line 1557 in H5L_link_object(): unable to create new
link to object
major: Links
minor: Unable to initialize object
  #003: H5L.c line 1798 in H5L__create_real(): can't insert link
major: Links
minor: Unable to insert object
  #004: H5Gtraverse.c line 851 in H5G_traverse(): internal path
traversal failed
major: Symbol table
minor: Object not found
  #005: H5Gtraverse.c line 627 in H5G__traverse_real(): traversal
operator failed
major: Symbol table
minor: Callback failed
  #006: H5L.c line 1604 in H5L__link_cb(): unable to create object
major: Links
minor: Unable to initialize object
  #007: H5Oint.c line 2453 in H5O_obj_create(): unable to open object
major: Object header
minor: Can't open object
  #008: H5Doh.c line 300 in H5O__dset_create(): unable to create
dataset
minor: Object not found
  #005: H5Gtraverse.c line 627 in H5G__traverse_real(): traversal
operator failed
major: Symbol table
minor: Callback failed
  #006: H5L.c line 1604 in H5L__link_cb(): unable to create object
major: Links
minor: Unable to initialize object
  #007: H5Oint.c line 2453 in H5O_obj_create(): unable to open object
major: Object header
minor: Can't open object
  #008: H5Doh.c line 300 in H5O__dset_create(): unable to create
dataset
major: Dataset
minor: Unable to initialize object
  #009: H5Dint.c line 1274 in H5D__create(): unable to construct
layout information
major: Dataset
minor: Unable to initialize object
  #010: H5Dchunk.c line 872 in H5D__chunk_construct(): unable to set chunk sizes
major: Dataset
minor: Bad value
  #011: H5Dchunk.c line 831 in H5D__chunk_set_sizes(): chunk size must be < 4GB
major: Dataset
minor: Unable to initialize object
major: Dataset
minor: Unable to initialize object
  #009: H5Dint.c line 1274 in H5D__create(): unable to construct
layout information
major: Dataset
minor: Unable to initialize object
  #010: H5Dchunk.c line 872 in H5D__chunk_construct(): unable to set chunk sizes
major: Dataset
minor: Bad value
  #011: H5Dchunk.c line 831 in H5D__chunk_set_sizes(): chunk size must be < 4GB
major: Dataset
minor: Unable to initialize object
...

I spoke to Barry last evening who said that this is a known error that was
fixed for DMDA vecs but is broken for non-dmda vecs.

Could this be fixed ?

Thank You,
Sajid Ali
Applied Physics
Northwestern University


Re: [petsc-users] Error with VecDestroy_MPIFFTW+0x61

2019-04-15 Thread Sajid Ali via petsc-users
Hi Barry & Matt,

I'd be happy to contribute a patch once I understand what's going on.

@Matt, where is the padding occurring? In VecCreateFFTW I see that each
process looks up the dimension of the array it's supposed to hold and asks for
memory to hold it via fftw_malloc (which, as you say, is just a wrapper around
simd-aligned malloc). Is the crash occurring because the first vector was
created via fftw_malloc but duplicated via PetscMalloc, and the two happen to
have different alignment sizes (FFTW was compiled with simd=avx2 since I'm
using a Broadwell Xeon, while PetscMalloc aligns to PETSC_MEMALIGN) ?

PS: I've only ever used FFTW via the python interface (and automated the
build, but couldn't automate testing of pyfftw-mpi since cython coverage
reporting is confusing).

Thank You,
Sajid Ali
Applied Physics
Northwestern University


Re: [petsc-users] Error with VecDestroy_MPIFFTW+0x61

2019-04-14 Thread Sajid Ali via petsc-users
Thanks for the temporary fix.

(PS: I was wondering if it would be trivial to just extend the code to have
four mallocs and create a new function but it looks like the logic is much
more complicated.)


Re: [petsc-users] Error with VecDestroy_MPIFFTW+0x61

2019-04-14 Thread Sajid Ali via petsc-users
 Hi Matt,

While in theory that sounds perfect, I still get the same error. I'm
attaching a minimal test program which creates 3 vectors x,y,z via the
petsc-fftw interface and a test vector via VecDuplicate, and then destroys
all the vectors. Without the test vector everything works fine.

Thank You,
Sajid Ali
Applied Physics
Northwestern University


ex_modify.c
Description: Binary data


[petsc-users] Error with VecDestroy_MPIFFTW+0x61

2019-04-14 Thread Sajid Ali via petsc-users
Hi PETSc Developers,

I happen to be a user who needs 4 vectors whose layout is aligned with
FFTW. The usual MatCreateVecsFFTW allows one to make 3 such vectors. To get
around this I call the function twice, once with three vectors, once with
one vector (and 2x NULL). This causes a strange segfault when freeing the
memory with VecDestroy (for the lone vector I'm guessing).

My program runs with no issues if I create the fourth vector with
VecCreateMPI (I need the 4th vector for a point-wise multiply, so this would
be inefficient).

To get around this problem, is there a way to ask for 4 vectors aligned to
the FFTW matrix? If not, is there a way to get the intended behavior from
VecCreateMPI (perhaps by using a helper function to determine the data
alignment and passing that to it instead of using PETSC_DECIDE)?
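
In the meantime, a minimal sketch of the VecCreateMPI workaround I have in
mind, matching the parallel layout (though not the memory alignment) of the
FFTW vectors instead of using PETSC_DECIDE; this assumes A is the FFTW Mat
and the usual ierr/CHKERRQ error handling:

```
  Vec      x, y, z, w;
  PetscInt nlocal, N;

  ierr = MatCreateVecsFFTW(A, &x, &y, &z);CHKERRQ(ierr);
  ierr = VecGetLocalSize(x, &nlocal);CHKERRQ(ierr);   /* per-rank layout chosen by FFTW */
  ierr = VecGetSize(x, &N);CHKERRQ(ierr);
  /* same distribution as x, so point-wise multiplies line up across ranks */
  ierr = VecCreateMPI(PETSC_COMM_WORLD, nlocal, N, &w);CHKERRQ(ierr);
```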

I'm attaching my code just in case what I'm thinking is wrong and anyone
would be kind enough to point it out to me. The issue is at line 87/88.
With 87, the program crashes, with 88 it works fine.

Thanks in advance for the help!

-- 
Sajid Ali
Applied Physics
Northwestern University


ex_ms.c
Description: Binary data


Re: [petsc-users] How to build FFTW3 interface?

2019-04-12 Thread Sajid Ali via petsc-users
Hi Balay,

Confirming that the spack variant works. Thanks for adding it.


[petsc-users] How to build FFTW3 interface?

2019-04-11 Thread Sajid Ali via petsc-users
Hi PETSc Developers,

To run an example that involves the petsc-fftw interface, I loaded both the
petsc and fftw modules (linked, of course, to the same mpi), but the compiler
complains of having no knowledge of functions like MatCreateVecsFFTW, which
happens to be defined (in the source repo) at
petsc/src/mat/impls/fft/fftw.c. I don't see a corresponding definition in
the install folder (I may be wrong, but I just did a simple grep for the
definition of the function I'm looking for and didn't find it, while it
was present in the header and example files).

From previous threads on this list-serv I see that the developers asked
users to use --download-fftw at configure time, but for users that already
have an fftw installed, is there an option to ask petsc to build the
interfaces as well? (I didn't see any such option listed here:
https://www.mcs.anl.gov/petsc/documentation/installation.html, or a variant
in spack.)

Also, could the fftw version to download be bumped to 3.3.8 (here :
petsc/config/BuildSystem/config/packages/fftw.py) since 3.3.7 gives
erroneous results with gcc-8.

Bug in fftw-3.3.7+gcc-8 :
https://github.com/FFTW/fftw3/commit/19eeeca592f63413698f23dd02b9961f22581803


Thank You,
Sajid Ali
Applied Physics
Northwestern University


Re: [petsc-users] Argument out of range error only in certain mpi sizes

2019-04-11 Thread Sajid Ali via petsc-users
One last question I have is : does PETSc automatically choose a good chunk
size for the size of the vector it has and use it to write the dataset ? Or
is this something I shouldn't really worry about (not that it affects me
now, but it would be good not to have a slow read from a python script for
post-processing)?


Re: [petsc-users] Argument out of range error only in certain mpi sizes

2019-04-10 Thread Sajid Ali via petsc-users
Thanks a lot for the advice Matt and Barry.

One thing I wanted to confirm: when I change from using a regular Vec to a
Vec created using DMDACreateGlobalVector, then to fill it with data from
hdf5 I have to change the dimensions of the hdf5 vectors from (dim_x*dim_y)
to (dim_x,dim_y), right?

I ask because I see that if I write to hdf5 from a complex vector created
using a DMDA, I get a vector that has dimensions (dim_x,dim_y,2), whereas
before the dimensions of the same data were (dim_x*dim_y,2).

-- 
Sajid Ali
Applied Physics
Northwestern University


[petsc-users] Argument out of range error only in certain mpi sizes

2019-04-10 Thread Sajid Ali via petsc-users
Hi PETSc developers,

I wanted to convert my code, in which I was using general Vec/Mat, to
DMDA-based grid management (nothing fancy, just a 5-point complex stencil).
For this, I created a DA object and created the global solution vector
from it. This worked fine.

Now, I created the matrix using DMCreateMatrix and filled it using the
regular function. I get no error when I run the problem using mpirun -np 1,
so I thought my matrix-filling logic aligns with the DM non-zero locations
(star stencil, width 1). With mpirun -np 2, no errors either. But with
mpirun -np 4 or 8, I get errors which say : "Argument out of range /
Inserting a new nonzero at global row/column ... into matrix".

I could switch over to the logic provided in KSP/ex46.c for filling the
matrix via the MatSetStencil logic (a sketch of that approach follows below),
but I wanted to know what I was doing wrong in my current code, since I've
never seen an error that depends on the number of mpi processes.
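
For reference, a minimal sketch of the stencil-based fill I'd switch to,
assuming a 2D DMDA (da) with one degree of freedom; the coefficients diag/off
are placeholders, boundary rows only get a diagonal entry here, and the usual
ierr/CHKERRQ handling is assumed:

```
  Mat           J;
  DMDALocalInfo info;
  MatStencil    row, col[5];
  PetscScalar   v[5], diag = 4.0, off = -1.0;   /* placeholder coefficients */
  PetscInt      i, j;

  ierr = DMCreateMatrix(da, &J);CHKERRQ(ierr);
  ierr = DMDAGetLocalInfo(da, &info);CHKERRQ(ierr);
  for (j = info.ys; j < info.ys + info.ym; j++) {
    for (i = info.xs; i < info.xs + info.xm; i++) {
      row.i = i; row.j = j;
      if (i == 0 || i == info.mx - 1 || j == 0 || j == info.my - 1) {
        /* boundary rows: diagonal only in this sketch */
        col[0].i = i; col[0].j = j; v[0] = diag;
        ierr = MatSetValuesStencil(J, 1, &row, 1, col, v, INSERT_VALUES);CHKERRQ(ierr);
      } else {
        /* interior: 5-point stencil given in grid (i,j) indices; the DMDA
           maps these to the correct global rows/columns on every rank */
        col[0].i = i;     col[0].j = j;     v[0] = diag;
        col[1].i = i - 1; col[1].j = j;     v[1] = off;
        col[2].i = i + 1; col[2].j = j;     v[2] = off;
        col[3].i = i;     col[3].j = j - 1; v[3] = off;
        col[4].i = i;     col[4].j = j + 1; v[4] = off;
        ierr = MatSetValuesStencil(J, 1, &row, 5, col, v, INSERT_VALUES);CHKERRQ(ierr);
      }
    }
  }
  ierr = MatAssemblyBegin(J, MAT_FINAL_ASSEMBLY);CHKERRQ(ierr);
  ierr = MatAssemblyEnd(J, MAT_FINAL_ASSEMBLY);CHKERRQ(ierr);
```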

I'm attaching the code (which works if the matrix is created without using
the DA, i.e. comment out line 159 and uncomment 161/162; I'm doing this on
a small grid to catch errors).

Thanks in advance for the help.


-- 
Sajid Ali
Applied Physics
Northwestern University


ex_dmda.c
Description: Binary data


[petsc-users] Estimate memory needs for large grids

2019-04-05 Thread Sajid Ali via petsc-users
Hi,

I'm solving a simple linear equation [ u_t = A*u_xx + A*u_yy + F_t*u ] on
a grid of size 55296x55296. I'm reading a vector of that size from an hdf5
file and have the jacobian matrix as a modified 5-point stencil, which is
preallocated with the following
```
  ierr = MatCreate(PETSC_COMM_WORLD,&A);CHKERRQ(ierr);
  ierr = MatSetSizes(A,PETSC_DECIDE,PETSC_DECIDE,M,M);CHKERRQ(ierr);
  ierr = MatSetType(A,MATMPIAIJ);CHKERRQ(ierr);
  ierr = MatSetFromOptions(A);CHKERRQ(ierr);
  ierr = MatMPIAIJSetPreallocation(A,5,NULL,5,NULL);CHKERRQ(ierr);
  ierr = MatSeqAIJSetPreallocation(A,5,NULL);CHKERRQ(ierr);
```
The total number of elements is ~3e9 and the matrix size is ~9e9 (but only 5
diagonals are nonzero). I'm reading F_t, which has ~3e9 elements. I'm
using double complex numbers and I've compiled with int64 indices.

Thus, for the vector I need 55296x55296x2x8 bytes ~ 50 GB, and for the F
vector another 50 GB. For the matrix I need ~250 GB, plus some overhead for
the solver.

How do I estimate this overhead (and estimate how many nodes I would need
to run this, given the maximum memory per node as specified by slurm's
--mem option) ?
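
For the matrix itself, a rough sketch of the arithmetic I'd use, assuming the
standard AIJ/CSR layout (one complex value and one 64-bit column index per
nonzero, plus one 64-bit row pointer per row); Krylov work vectors and
whatever the preconditioner allocates come on top of this:

```
  /* rough AIJ footprint: values + column indices + row pointers (int64 indices) */
  long long n   = 55296LL * 55296LL;   /* rows           ~3.06e9  */
  long long nnz = 5LL * n;             /* 5-pt stencil   ~1.53e10 */
  double    GB  = 1.0e9;
  double    val = nnz * 16.0 / GB;     /* complex double values  ~245 GB */
  double    col = nnz *  8.0 / GB;     /* int64 column indices   ~122 GB */
  double    row = (n + 1) * 8.0 / GB;  /* int64 row pointers     ~ 25 GB */
  /* ~390 GB for the assembled matrix alone, before any solver workspace */
```

By this count the ~250 GB figure covers just the values, and the int64
indices add roughly another 150 GB.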

Thanks in advance for the help!

-- 
Sajid Ali
Applied Physics
Northwestern University


[petsc-users] Converting complex PDE to real for KNL performance ?

2019-03-27 Thread Sajid Ali via petsc-users
 Hi,

I'm able to solve the following equation using complex numbers (with
ts_type cn and pc_type gamg) :
  u_t = A*u'' + F_t*u;
(where A = -1j/(2k) and u'' refers to u_xx+u_yy, implemented with the
familiar 5-point stencil)

Now, I want to solve the same problem using real numbers. The equivalent
equations are:
u_t_real   =  1/(2k) * u''_imag + F_real*u_real   - F_imag*u_imag
u_t_imag = -1/(2k) * u''_real   + F_imag*u_real - F_real*u_imag

Thus, if we now take our new u vector to have twice the length of the
problem we're solving, keeping the first half as the real part and the second
half as the imaginary part, we'd get a matrix with blocks computing the
Laplacian via the 5-point stencil in the top-right and bottom-left corners,
plus a diagonal [F_real+F_imag, F_real-F_imag] term.

I tried doing this and the gamg preconditioner complains about an
unsymmetric matrix. If I use the default preconditioner, I get
DIVERGED_NONLINEAR_SOLVE.

Is there a way to better organize the matrix ?

PS: I'm trying to do this using only real numbers because I realized that
the optimized avx-512 kernels for KNL are not implemented for complex
numbers. Would that be implemented soon ?

Thank You,
Sajid Ali
Applied Physics
Northwestern University


Re: [petsc-users] Direct PETSc to use MCDRAM on KNL and other optimizations for KNL

2019-03-01 Thread Sajid Ali via petsc-users
Hi Hong,

So, the speedup was coming from increased DRAM bandwidth and not the usage
of MCDRAM.

There is moderate MPI imbalance, a large number of back-end stalls, and good
vectorization.

I'm attaching my submit script, PETSc log file and Intel APS summary (all
as non-HTML text). I can give more detailed analysis via Intel Vtune if
needed.


Thank You,
Sajid Ali
Applied Physics
Northwestern University


submit_script
Description: Binary data


intel_aps_report
Description: Binary data


knl_petsc
Description: Binary data


Re: [petsc-users] Direct PETSc to use MCDRAM on KNL and other optimizations for KNL

2019-02-28 Thread Sajid Ali via petsc-users
Hi Hong,

Thanks for the advice. I see that the example takes ~180 seconds to run but
I can't see the DRAM vs MCDRAM info from Intel APS. I'll try to fix the
profiling and get back with further questions.

Also, the intel-mpi manpages say that the use of tmi is now deprecated :
https://software.intel.com/en-us/mpi-developer-guide-linux-fabrics-control


Thank You,
Sajid Ali
Applied Physics
Northwestern University


Re: [petsc-users] Direct PETSc to use MCDRAM on KNL and other optimizations for KNL

2019-02-27 Thread Sajid Ali via petsc-users
Hi Junchao,

I’m confused with the syntax. If I submit the following as my job script, I
get an error :

#!/bin/bash
#SBATCH --job-name=petsc_test
#SBATCH -N 1
#SBATCH -C knl,quad,flat
#SBATCH -p apsxrmd
#SBATCH --time=1:00:00

module load intel/18.0.3-d6gtsxs
module load intel-parallel-studio/cluster.2018.3-xvnfrfz
module load numactl-2.0.12-intel-18.0.3-wh44iog
srun -n 64 -c 64 --cpu_bind=cores numactl -m 1 aps ./ex_modify
-ts_type cn -prop_steps 25 -pc_type gamg -ts_monitor -log_view

The error is :
srun: cluster configuration lacks support for cpu binding
srun: error: Unable to create step for job 916208: More processors
requested than permitted

I’m following the advice as given at slide 33 of
https://www.nersc.gov/assets/Uploads/02-using-cori-knl-nodes-20170609.pdf

For further info, I’m using LCRC at ANL.

Thank You,
Sajid Ali
Applied Physics
Northwestern University


[petsc-users] Link to the latest tutorials broken

2019-02-07 Thread Sajid Ali via petsc-users
Hi,

The links to the Jan 2019 presentations at
https://www.mcs.anl.gov/petsc/documentation/tutorials/index.html are
broken. Could these be fixed ?

Thank You,
Sajid Ali
Applied Physics
Northwestern University


Re: [petsc-users] petsc4py for numpy array <-> MATDENSE binary

2019-02-01 Thread Sajid Ali via petsc-users
The vector is essentially a set of snapshots in time of a data array. I should
probably store this as a 2D dense matrix of dimensions (dim_x*dim_y) x
dim_z. Then I can pick one column at a time and use it in my TS Jacobian.
Apologies for being a little unclear.
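
A minimal sketch of how I'd pull one snapshot out of such a MATDENSE matrix
inside the Jacobian routine (F is the (dim_x*dim_y) x dim_z dense matrix and
step is the column index; both names are placeholders, and the usual
ierr/CHKERRQ handling is assumed):

```
  Mat      F;     /* dense, (dim_x*dim_y) rows, dim_z columns */
  Vec      f_t;   /* one snapshot, same row layout as F */
  PetscInt step;  /* which time slice to use */

  ierr = MatCreateVecs(F, NULL, &f_t);CHKERRQ(ierr);        /* row-compatible vector */
  ierr = MatGetColumnVector(F, f_t, step);CHKERRQ(ierr);    /* extract one column    */
  /* ... use f_t to update the diagonal of the RHS Jacobian ... */
```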

-- 
Sajid Ali
Applied Physics
Northwestern University


Re: [petsc-users] Reading a complex vector from HDF5

2019-02-01 Thread Sajid Ali via petsc-users
I think I understand what's happening here: when I look at a data file
created the same way as in the aforementioned example, I see a complex=1
attribute that is missing when I make my own hdf5 file.
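
In case it helps anyone searching later, a sketch of how one might tag an
externally written file using the raw HDF5 C API. I'm assuming the attribute
PETSc looks for is an integer named "complex" with value 1 attached to the
dataset (worth confirming against a file produced by VecView), and dset_id
here stands for the already-created dataset of shape (dim, 2):

```
  hid_t space, attr;
  int   is_complex = 1;

  space = H5Screate(H5S_SCALAR);
  attr  = H5Acreate2(dset_id, "complex", H5T_NATIVE_INT, space,
                     H5P_DEFAULT, H5P_DEFAULT);
  H5Awrite(attr, H5T_NATIVE_INT, &is_complex);   /* mark the dataset as complex-valued */
  H5Aclose(attr);
  H5Sclose(space);
```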


Re: [petsc-users] Reading a complex vector from HDF5

2019-02-01 Thread Sajid Ali via petsc-users
Column 1 contains the real value and column 2 contains the imaginary value,
correct?

I did that last time as well (and opened it using h5py just to be sure that
the shape is indeed dim x 2 and the datatype is f8),  yet I get the error.

The error comes from these lines in PETSc :

#if defined(PETSC_USE_COMPLEX)
if (!h->complexVal) {
H5T_class_t clazz = H5Tget_class(datatype);
if (clazz == H5T_FLOAT) SETERRQ(PetscObjectComm((PetscObject)viewer),
PETSC_ERR_SUP,"File contains real numbers but PETSc is configured for
complex. The conversion is not yet implemented. Configure with
--with-scalar-type=real.");
}

Am I setting the dtype incorrectly?


[petsc-users] Reading a complex vector from HDF5

2019-02-01 Thread Sajid Ali via petsc-users
Hi,

I'm trying to load a complex vector from hdf5 and I get the
following error:

[0]PETSC ERROR: - Error Message
--
[0]PETSC ERROR: No support for this operation for this object type
[0]PETSC ERROR: File contains real numbers but PETSc is configured for
complex. The conversion is not yet implemented. Configure with
--with-scalar-type=real.
[0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for
trouble shooting.
[0]PETSC ERROR: Petsc Development GIT revision:
ba892ce6f06cbab637e31ab2bf33fd996c92d19d  GIT Date: 2019-01-29 05:55:16
+0100
[0]PETSC ERROR: ./ex_modify on a  named xrm by sajid Fri Feb  1 14:18:30
2019
[0]PETSC ERROR: Configure options
--prefix=/raid/home/sajid/packages/spack/opt/spack/linux-rhel7-x86_64/gcc-8.2.0/petsc-develop-gh5lollh4dzklppm7kb7wpjtyh4gi4mi
--with-ssl=0 --download-c2html=0 --download-sowing=0 --download-hwloc=0
CFLAGS="-march=native -O2" FFLAGS="-march=native -O2"
CXXFLAGS="-march=native -O2"
--with-cc=/raid/home/sajid/packages/spack/opt/spack/linux-rhel7-x86_64/gcc-8.2.0/mpich-3.3-x4tesuo4sxsomfxns5i26vco7ywojdmz/bin/mpicc
--with-cxx=/raid/home/sajid/packages/spack/opt/spack/linux-rhel7-x86_64/gcc-8.2.0/mpich-3.3-x4tesuo4sxsomfxns5i26vco7ywojdmz/bin/mpic++
--with-fc=/raid/home/sajid/packages/spack/opt/spack/linux-rhel7-x86_64/gcc-8.2.0/mpich-3.3-x4tesuo4sxsomfxns5i26vco7ywojdmz/bin/mpif90
--with-precision=double --with-scalar-type=complex
--with-shared-libraries=1 --with-debugging=0 --with-64-bit-indices=0
COPTFLAGS= FOPTFLAGS= CXXOPTFLAGS=
--with-blaslapack-lib=/raid/home/sajid/packages/spack/opt/spack/linux-rhel7-x86_64/gcc-8.2.0/openblas-0.3.5-heqggft2eii2xktdbyp2zovez2yj6ugd/lib/libopenblas.so
--with-x=1 --with-clanguage=C --with-scalapack=0 --with-metis=1
--with-metis-dir=/raid/home/sajid/packages/spack/opt/spack/linux-rhel7-x86_64/gcc-8.2.0/metis-5.1.0-mugo6fn7u2nlz72vo4ajlhbba5q7a5jn
--with-hdf5=1
--with-hdf5-dir=/raid/home/sajid/packages/spack/opt/spack/linux-rhel7-x86_64/gcc-8.2.0/hdf5-1.10.4-pes54zx5fsbgdg2nr7reyopc4zfc43wo
--with-hypre=0 --with-parmetis=1
--with-parmetis-dir=/raid/home/sajid/packages/spack/opt/spack/linux-rhel7-x86_64/gcc-8.2.0/parmetis-4.0.3-kizvrkoos4ddlfc7jsnatvh3mdk3lgvg
--with-mumps=0 --with-trilinos=0 --with-cxx-dialect=C++11
--with-superlu_dist-include=/raid/home/sajid/packages/spack/opt/spack/linux-rhel7-x86_64/gcc-8.2.0/superlu-dist-develop-d6dqq3ribpkyoawnwuckbtydgddijj6j/include
--with-superlu_dist-lib=/raid/home/sajid/packages/spack/opt/spack/linux-rhel7-x86_64/gcc-8.2.0/superlu-dist-develop-d6dqq3ribpkyoawnwuckbtydgddijj6j/lib/libsuperlu_dist.a
--with-superlu_dist=1 --with-suitesparse=0
--with-zlib-include=/raid/home/sajid/packages/spack/opt/spack/linux-rhel7-x86_64/gcc-8.2.0/zlib-1.2.11-v4lhujkxdy6ukr5ymjxgxcfb2qoo6vf3/include
--with-zlib-lib="-L/raid/home/sajid/packages/spack/opt/spack/linux-rhel7-x86_64/gcc-8.2.0/zlib-1.2.11-v4lhujkxdy6ukr5ymjxgxcfb2qoo6vf3/lib
-lz" --with-zlib=1
[0]PETSC ERROR: #1 PetscViewerHDF5Load() line 1348 in
/tmp/sajid/spack-stage/spack-stage-hy86Yb/petsc/src/sys/classes/viewer/impls/hdf5/hdf5v.c
[0]PETSC ERROR: #2 VecLoad_HDF5() line 170 in
/tmp/sajid/spack-stage/spack-stage-hy86Yb/petsc/src/vec/vec/utils/vecio.c
[0]PETSC ERROR: #3 VecLoad_Default() line 288 in
/tmp/sajid/spack-stage/spack-stage-hy86Yb/petsc/src/vec/vec/utils/vecio.c
[0]PETSC ERROR: #4 VecLoad() line 933 in
/tmp/sajid/spack-stage/spack-stage-hy86Yb/petsc/src/vec/vec/interface/vector.c
[0]PETSC ERROR: #5 main() line 132 in
/raid/home/sajid/packages/xwp_petsc/2d/matter_repeat/ex_modify.c
[0]PETSC ERROR: No PETSc Option Table entries
[0]PETSC ERROR: End of Error Message ---send entire
error message to petsc-ma...@mcs.anl.gov--

As per src/vec/vec/examples/tutorials/ex10.c, a complex 1D vector is saved
as a 2D real-valued vector of shape (dim, 2). I've done the same with my data
and yet I get the above error. Is there any special precaution I should
take when making the HDF5 file ?

Thank You,
Sajid Ali
Applied Physics
Northwestern University


[petsc-users] petsc4py for numpy array <-> MATDENSE binary

2019-01-31 Thread Sajid Ali via petsc-users
Hi,

I have a large (~25k x 25k x 50) 3D array that I want to store as a binary,
PETSc-readable matrix using the MATDENSE format. I've currently achieved
this (for smaller arrays) by writing the matrix out to an ASCII file and
converting this ASCII file to binary, one number at a time, using a C
script.

Does petsc4py support converting a 3d numpy array (complex valued) to
MATDENSE (MPI) binary format directly ?

I saw some discussion on this topic back in 2012 on the mailing list but
it's not clear to me what happened after that.
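
While waiting on a petsc4py answer: as a C-side alternative to the
ASCII-then-convert step, a minimal sketch that fills a dense matrix and dumps
it straight to PETSc binary (the sizes, fill loop, and file name are
placeholders; the file can then be read back with MatLoad):

```
#include <petscmat.h>

int main(int argc, char **argv)
{
  Mat            A;
  PetscViewer    viewer;
  PetscInt       i, j, M = 100, N = 50, rstart, rend;
  PetscErrorCode ierr;

  ierr = PetscInitialize(&argc, &argv, NULL, NULL); if (ierr) return ierr;
  ierr = MatCreateDense(PETSC_COMM_WORLD, PETSC_DECIDE, PETSC_DECIDE, M, N, NULL, &A);CHKERRQ(ierr);
  ierr = MatGetOwnershipRange(A, &rstart, &rend);CHKERRQ(ierr);
  for (i = rstart; i < rend; i++) {                 /* each rank fills its own rows */
    for (j = 0; j < N; j++) {
      PetscScalar v = (PetscScalar)(i + j);         /* placeholder values */
      ierr = MatSetValue(A, i, j, v, INSERT_VALUES);CHKERRQ(ierr);
    }
  }
  ierr = MatAssemblyBegin(A, MAT_FINAL_ASSEMBLY);CHKERRQ(ierr);
  ierr = MatAssemblyEnd(A, MAT_FINAL_ASSEMBLY);CHKERRQ(ierr);
  ierr = PetscViewerBinaryOpen(PETSC_COMM_WORLD, "dense.dat", FILE_MODE_WRITE, &viewer);CHKERRQ(ierr);
  ierr = MatView(A, viewer);CHKERRQ(ierr);          /* writes PETSc binary, readable by MatLoad */
  ierr = PetscViewerDestroy(&viewer);CHKERRQ(ierr);
  ierr = MatDestroy(&A);CHKERRQ(ierr);
  ierr = PetscFinalize();
  return ierr;
}
```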

Thank You,
Sajid Ali
Applied Physics
Northwestern University


Re: [petsc-users] Question about TSSetRHSJacobian for linear time dependent problem

2019-01-27 Thread Sajid Ali via petsc-users
The form is u_t = A(t)u.

On Sun, Jan 27, 2019 at 4:24 PM Smith, Barry F.  wrote:

>
>
> > On Jan 25, 2019, at 4:51 PM, Sajid Ali via petsc-users <
> petsc-users@mcs.anl.gov> wrote:
> >
> > Hi,
> >
> > If I have a linear time dependent equation I'm trying to solve using TS,
> I can use :
> > TSSetProblemType(ts,TS_LINEAR);
> > TSSetRHSFunction(ts,NULL,TSComputeRHSFunctionLinear,NULL);
> > TSSetRHSJacobian(ts,A,A,YourComputeRHSJacobian, );
> >
> > If the matrix that's being evaluated by YourComputeRHSJacobian is such
> that the non-zero structure stays the same and only the diagonal changes
> with time, is there a way to optimize the function so that it doesn't
> create the whole matrix from scratch each time ?
>
> If it is a linear PDE u_t = A u  then how can A change with time? It
> sounds like it really isn't a linear problem?
>
>Barry
>
> >
> > Naively I can make a dummy matrix and store the copy from t=0 and change
> the diagonal at each iteration but that unnecessarily doubles the memory
> consumption, is there a better way?
> >
> >
> > Thank You,
> > Sajid Ali
> > Applied Physics
> > Northwestern University
>
>

-- 
Sajid Ali
Applied Physics
Northwestern University


[petsc-users] Question about TSSetRHSJacobian for linear time dependent problem

2019-01-25 Thread Sajid Ali via petsc-users
Hi,

If I have a linear time dependent equation I'm trying to solve using TS, I
can use :
TSSetProblemType(ts,TS_LINEAR);
TSSetRHSFunction(ts,NULL,TSComputeRHSFunctionLinear,NULL);
TSSetRHSJacobian(ts,A,A,YourComputeRHSJacobian, );

If the matrix that's being evaluated by YourComputeRHSJacobian is such that
the non-zero structure stays the same and only the diagonal changes with
time, is there a way to optimize the function so that it doesn't create the
whole matrix from scratch each time ?

Naively, I can make a dummy matrix, store the copy from t=0, and change
the diagonal at each iteration, but that unnecessarily doubles the memory
consumption. Is there a better way?
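
A sketch of the kind of thing I mean, assuming the off-diagonal (stencil)
entries are written once up front and the Jacobian callback only refreshes
the diagonal each call; AppCtx and ComputeDiagonal are hypothetical names for
the user context and a routine that fills a work Vec with the full diagonal
at time t (constant stencil part plus F(t)):

```
PetscErrorCode MyRHSJacobian(TS ts, PetscReal t, Vec u, Mat A, Mat P, void *ctx)
{
  AppCtx        *app = (AppCtx *)ctx;   /* holds the work Vec app->diag */
  PetscErrorCode ierr;

  PetscFunctionBeginUser;
  /* fill app->diag with the diagonal at time t: constant part + F(t) */
  ierr = ComputeDiagonal(t, app->diag, app);CHKERRQ(ierr);        /* user routine */
  ierr = MatDiagonalSet(A, app->diag, INSERT_VALUES);CHKERRQ(ierr);
  if (P != A) { ierr = MatDiagonalSet(P, app->diag, INSERT_VALUES);CHKERRQ(ierr); }
  PetscFunctionReturn(0);
}
```

This way the extra storage is a single Vec for the diagonal rather than a
second copy of the whole matrix.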


Thank You,
Sajid Ali
Applied Physics
Northwestern University


Re: [petsc-users] VecCopy fails after VecDuplicate

2019-01-17 Thread Sajid Ali via petsc-users
Never mind, it was my fault in thinking the error was with u_abs and not u.
I switched from local-array-based value setting for the initial conditions to
VecSetValues when converting the uniprocessor example to an MPI program.
While I removed VecRestoreArray and swapped the u_local[*ptr] assignments for
VecSetValues, I missed adding VecAssemblyBegin/End to compensate.
Thanks for pointing out that the error was with u and not u_abs.
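
For the archives, a minimal sketch of the corrected pattern (nvals, indices,
and values are placeholders for however the initial condition is built):

```
  /* set initial conditions with VecSetValues instead of a local array */
  ierr = VecSetValues(u, nvals, indices, values, INSERT_VALUES);CHKERRQ(ierr);
  ierr = VecAssemblyBegin(u);CHKERRQ(ierr);   /* these two calls were missing */
  ierr = VecAssemblyEnd(u);CHKERRQ(ierr);

  /* only after u is assembled is it safe to duplicate and copy */
  ierr = VecDuplicate(u, &u_abs);CHKERRQ(ierr);
  ierr = VecCopy(u, u_abs);CHKERRQ(ierr);
```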

On Thu, Jan 17, 2019 at 1:29 PM Sajid Ali 
wrote:

> As requested :
>
> [sajid@xrm free_space]$ ./ex_modify
> Solving a linear TS problem on 1 processor
> m : 256, slices : 1000.00, lambda : 1.239800e-10
> [0]PETSC ERROR: - Error Message
> --
> [0]PETSC ERROR: Object is in wrong state
> [0]PETSC ERROR: Not for unassembled vector
> [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html
> for trouble shooting.
> [0]PETSC ERROR: Petsc Development GIT revision:
> 0cd88d33dca7e1f18a10cbb6fcb08f83d068c5f4  GIT Date: 2019-01-06 13:27:26
> -0600
> [0]PETSC ERROR: ./ex_modify on a  named xrm by sajid Thu Jan 17 13:29:12
> 2019
> [0]PETSC ERROR: Configure options
> --prefix=/raid/home/sajid/packages/spack/opt/spack/linux-rhel7-x86_64/gcc-7.3.0/petsc-develop-2u6vuwagkoczyvnpsubzrubmtmpfhhkj
> --with-ssl=0 --download-c2html=0 --download-sowing=0 --download-hwloc=0
> CFLAGS= FFLAGS= CXXFLAGS=
> --with-cc=/raid/home/sajid/packages/spack/opt/spack/linux-rhel7-x86_64/gcc-7.3.0/mpich-3.3-z5uiwmx24jylnivuhlnqjjmm674ozj6x/bin/mpicc
> --with-cxx=/raid/home/sajid/packages/spack/opt/spack/linux-rhel7-x86_64/gcc-7.3.0/mpich-3.3-z5uiwmx24jylnivuhlnqjjmm674ozj6x/bin/mpic++
> --with-fc=/raid/home/sajid/packages/spack/opt/spack/linux-rhel7-x86_64/gcc-7.3.0/mpich-3.3-z5uiwmx24jylnivuhlnqjjmm674ozj6x/bin/mpif90
> --with-precision=double --with-scalar-type=complex
> --with-shared-libraries=1 --with-debugging=1 --with-64-bit-indices=0
> --with-debugging=%s
> --with-blaslapack-lib="/raid/home/sajid/packages/spack/opt/spack/linux-rhel7-x86_64/gcc-7.3.0/intel-mkl-2019.0.117-wzqlcijwx7odz2x5chembudo5leqpfh2/compilers_and_libraries_2019.0.117/linux/mkl/lib/intel64/libmkl_intel_lp64.so
> /raid/home/sajid/packages/spack/opt/spack/linux-rhel7-x86_64/gcc-7.3.0/intel-mkl-2019.0.117-wzqlcijwx7odz2x5chembudo5leqpfh2/compilers_and_libraries_2019.0.117/linux/mkl/lib/intel64/libmkl_sequential.so
> /raid/home/sajid/packages/spack/opt/spack/linux-rhel7-x86_64/gcc-7.3.0/intel-mkl-2019.0.117-wzqlcijwx7odz2x5chembudo5leqpfh2/compilers_and_libraries_2019.0.117/linux/mkl/lib/intel64/libmkl_core.so
> /lib64/libpthread.so /lib64/libm.so /lib64/libdl.so" --with-x=1
> --with-clanguage=C --with-scalapack=0 --with-metis=1
> --with-metis-dir=/raid/home/sajid/packages/spack/opt/spack/linux-rhel7-x86_64/gcc-7.3.0/metis-5.1.0-nhgzn4kjskctzmzv35mstvd34nj2ugek
> --with-hdf5=1
> --with-hdf5-dir=/raid/home/sajid/packages/spack/opt/spack/linux-rhel7-x86_64/gcc-7.3.0/hdf5-1.10.4-ltstvsxvyjue2gxfegi4nvr6c5xg3zww
> --with-hypre=0 --with-parmetis=1
> --with-parmetis-dir=/raid/home/sajid/packages/spack/opt/spack/linux-rhel7-x86_64/gcc-7.3.0/parmetis-4.0.3-hw3j2ss7mjsc5x5f2gaflirnuufzptil
> --with-mumps=0 --with-trilinos=0 --with-cxx-dialect=C++11
> --with-superlu_dist-include=/raid/home/sajid/packages/spack/opt/spack/linux-rhel7-x86_64/gcc-7.3.0/superlu-dist-develop-cpspq4ca2hnyvhx4mz7zsupbj3do6md3/include
> --with-superlu_dist-lib=/raid/home/sajid/packages/spack/opt/spack/linux-rhel7-x86_64/gcc-7.3.0/superlu-dist-develop-cpspq4ca2hnyvhx4mz7zsupbj3do6md3/lib/libsuperlu_dist.a
> --with-superlu_dist=1 --with-suitesparse=0
> --with-zlib-include=/raid/home/sajid/packages/spack/opt/spack/linux-rhel7-x86_64/gcc-7.3.0/zlib-1.2.11-ldu43taplg2nbkxtem346zq4ibhad64i/include
> --with-zlib-lib="-L/raid/home/sajid/packages/spack/opt/spack/linux-rhel7-x86_64/gcc-7.3.0/zlib-1.2.11-ldu43taplg2nbkxtem346zq4ibhad64i/lib
> -lz" --with-zlib=1
> [0]PETSC ERROR: #1 VecCopy() line 1571 in
> /tmp/sajid/spack-stage/spack-stage-nwxY3Q/petsc/src/vec/vec/interface/vector.c
> [0]PETSC ERROR: #2 Monitor() line 296 in
> /raid/home/sajid/packages/xwp_petsc/1d/free_space/ex_modify.c
> [0]PETSC ERROR: #3 TSMonitor() line 3929 in
> /tmp/sajid/spack-stage/spack-stage-nwxY3Q/petsc/src/ts/interface/ts.c
> [0]PETSC ERROR: #4 TSSolve() line 3843 in
> /tmp/sajid/spack-stage/spack-stage-nwxY3Q/petsc/src/ts/interface/ts.c
> [0]PETSC ERROR: #5 main() line 188 in
> /raid/home/sajid/packages/xwp_petsc/1d/free_space/ex_modify.c
> [0]PETSC ERROR: No PETSc Option Table entries
> [0]PETSC ERROR: End of Error Message ---send entire
> error message to petsc-ma...@mcs.anl.gov--
> application called MPI_Abort(MPI_COMM_WORLD, 73) - process 0
> [unset]: write_line error; fd=-1 buf=:cmd=abort exitcode=73
> :
> system msg for write_line failure : Bad file descriptor
>
>

-- 
Sajid Ali
Applied Physics
Northwestern University


Re: [petsc-users] VecCopy fails after VecDuplicate

2019-01-17 Thread Sajid Ali via petsc-users
As requested :

[sajid@xrm free_space]$ ./ex_modify
Solving a linear TS problem on 1 processor
m : 256, slices : 1000.00, lambda : 1.239800e-10
[0]PETSC ERROR: - Error Message
--
[0]PETSC ERROR: Object is in wrong state
[0]PETSC ERROR: Not for unassembled vector
[0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for
trouble shooting.
[0]PETSC ERROR: Petsc Development GIT revision:
0cd88d33dca7e1f18a10cbb6fcb08f83d068c5f4  GIT Date: 2019-01-06 13:27:26
-0600
[0]PETSC ERROR: ./ex_modify on a  named xrm by sajid Thu Jan 17 13:29:12
2019
[0]PETSC ERROR: Configure options
--prefix=/raid/home/sajid/packages/spack/opt/spack/linux-rhel7-x86_64/gcc-7.3.0/petsc-develop-2u6vuwagkoczyvnpsubzrubmtmpfhhkj
--with-ssl=0 --download-c2html=0 --download-sowing=0 --download-hwloc=0
CFLAGS= FFLAGS= CXXFLAGS=
--with-cc=/raid/home/sajid/packages/spack/opt/spack/linux-rhel7-x86_64/gcc-7.3.0/mpich-3.3-z5uiwmx24jylnivuhlnqjjmm674ozj6x/bin/mpicc
--with-cxx=/raid/home/sajid/packages/spack/opt/spack/linux-rhel7-x86_64/gcc-7.3.0/mpich-3.3-z5uiwmx24jylnivuhlnqjjmm674ozj6x/bin/mpic++
--with-fc=/raid/home/sajid/packages/spack/opt/spack/linux-rhel7-x86_64/gcc-7.3.0/mpich-3.3-z5uiwmx24jylnivuhlnqjjmm674ozj6x/bin/mpif90
--with-precision=double --with-scalar-type=complex
--with-shared-libraries=1 --with-debugging=1 --with-64-bit-indices=0
--with-debugging=%s
--with-blaslapack-lib="/raid/home/sajid/packages/spack/opt/spack/linux-rhel7-x86_64/gcc-7.3.0/intel-mkl-2019.0.117-wzqlcijwx7odz2x5chembudo5leqpfh2/compilers_and_libraries_2019.0.117/linux/mkl/lib/intel64/libmkl_intel_lp64.so
/raid/home/sajid/packages/spack/opt/spack/linux-rhel7-x86_64/gcc-7.3.0/intel-mkl-2019.0.117-wzqlcijwx7odz2x5chembudo5leqpfh2/compilers_and_libraries_2019.0.117/linux/mkl/lib/intel64/libmkl_sequential.so
/raid/home/sajid/packages/spack/opt/spack/linux-rhel7-x86_64/gcc-7.3.0/intel-mkl-2019.0.117-wzqlcijwx7odz2x5chembudo5leqpfh2/compilers_and_libraries_2019.0.117/linux/mkl/lib/intel64/libmkl_core.so
/lib64/libpthread.so /lib64/libm.so /lib64/libdl.so" --with-x=1
--with-clanguage=C --with-scalapack=0 --with-metis=1
--with-metis-dir=/raid/home/sajid/packages/spack/opt/spack/linux-rhel7-x86_64/gcc-7.3.0/metis-5.1.0-nhgzn4kjskctzmzv35mstvd34nj2ugek
--with-hdf5=1
--with-hdf5-dir=/raid/home/sajid/packages/spack/opt/spack/linux-rhel7-x86_64/gcc-7.3.0/hdf5-1.10.4-ltstvsxvyjue2gxfegi4nvr6c5xg3zww
--with-hypre=0 --with-parmetis=1
--with-parmetis-dir=/raid/home/sajid/packages/spack/opt/spack/linux-rhel7-x86_64/gcc-7.3.0/parmetis-4.0.3-hw3j2ss7mjsc5x5f2gaflirnuufzptil
--with-mumps=0 --with-trilinos=0 --with-cxx-dialect=C++11
--with-superlu_dist-include=/raid/home/sajid/packages/spack/opt/spack/linux-rhel7-x86_64/gcc-7.3.0/superlu-dist-develop-cpspq4ca2hnyvhx4mz7zsupbj3do6md3/include
--with-superlu_dist-lib=/raid/home/sajid/packages/spack/opt/spack/linux-rhel7-x86_64/gcc-7.3.0/superlu-dist-develop-cpspq4ca2hnyvhx4mz7zsupbj3do6md3/lib/libsuperlu_dist.a
--with-superlu_dist=1 --with-suitesparse=0
--with-zlib-include=/raid/home/sajid/packages/spack/opt/spack/linux-rhel7-x86_64/gcc-7.3.0/zlib-1.2.11-ldu43taplg2nbkxtem346zq4ibhad64i/include
--with-zlib-lib="-L/raid/home/sajid/packages/spack/opt/spack/linux-rhel7-x86_64/gcc-7.3.0/zlib-1.2.11-ldu43taplg2nbkxtem346zq4ibhad64i/lib
-lz" --with-zlib=1
[0]PETSC ERROR: #1 VecCopy() line 1571 in
/tmp/sajid/spack-stage/spack-stage-nwxY3Q/petsc/src/vec/vec/interface/vector.c
[0]PETSC ERROR: #2 Monitor() line 296 in
/raid/home/sajid/packages/xwp_petsc/1d/free_space/ex_modify.c
[0]PETSC ERROR: #3 TSMonitor() line 3929 in
/tmp/sajid/spack-stage/spack-stage-nwxY3Q/petsc/src/ts/interface/ts.c
[0]PETSC ERROR: #4 TSSolve() line 3843 in
/tmp/sajid/spack-stage/spack-stage-nwxY3Q/petsc/src/ts/interface/ts.c
[0]PETSC ERROR: #5 main() line 188 in
/raid/home/sajid/packages/xwp_petsc/1d/free_space/ex_modify.c
[0]PETSC ERROR: No PETSc Option Table entries
[0]PETSC ERROR: End of Error Message ---send entire
error message to petsc-ma...@mcs.anl.gov--
application called MPI_Abort(MPI_COMM_WORLD, 73) - process 0
[unset]: write_line error; fd=-1 buf=:cmd=abort exitcode=73
:
system msg for write_line failure : Bad file descriptor


[petsc-users] VecCopy fails after VecDuplicate

2019-01-17 Thread Sajid Ali via petsc-users
Hi,

I have the following 2 lines in a function in my code :

 ierr = VecDuplicate(u,&u_abs);CHKERRQ(ierr);
 ierr = VecCopy(u,u_abs);CHKERRQ(ierr);

The VecCopy fails with the error message :
[0]PETSC ERROR: Object is in wrong state
[0]PETSC ERROR: Not for unassembled vector

adding the following statements doesn't help either :
ierr = VecAssemblyBegin(u_abs);CHKERRQ(ierr);
ierr = VecAssemblyEnd(u_abs);CHKERRQ(ierr);

If needed, the entire file is at :
https://github.com/s-sajid-ali/xwp_petsc/blob/master/1d/free_space/ex_modify.c

Thank You,
Sajid Ali
Applied Physics
Northwestern University


Re: [petsc-users] Question about correctly catching fp_trap

2019-01-07 Thread Sajid Ali via petsc-users
 >   Anyway the FPE occurs in MatSolve_SeqAIJ_NaturalOrdering which usually
indicates a zero on the diagonal of the matrix. Is that possible?
It looks like this is indeed the case here. Thanks for the hint.
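
(For anyone else hitting this: a quick sketch of how one could confirm a zero
on the diagonal before the solve, by pulling the diagonal out and printing its
smallest-magnitude entry; A is the assembled matrix.)

```
  Vec       d;
  PetscInt  loc;
  PetscReal dmin;

  ierr = MatCreateVecs(A, NULL, &d);CHKERRQ(ierr);
  ierr = MatGetDiagonal(A, d);CHKERRQ(ierr);
  ierr = VecAbs(d);CHKERRQ(ierr);                 /* |d_i|, works for complex builds too */
  ierr = VecMin(d, &loc, &dmin);CHKERRQ(ierr);
  ierr = PetscPrintf(PETSC_COMM_WORLD, "smallest |diagonal| = %g at row %D\n",
                     (double)dmin, loc);CHKERRQ(ierr);
  ierr = VecDestroy(&d);CHKERRQ(ierr);
```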

@Satish Balay : I tried building with the patch and
don't see any difference. Do you want me to send you the config and build
logs to investigate further? Apart from the -g flag, as I've stated above,
another bug is that petsc uses the system gdb (rhel-7) and not the gdb
associated with the gcc that was used to build petsc.


Re: [petsc-users] Question about correctly catching fp_trap

2019-01-04 Thread Sajid Ali via petsc-users
Trying it slightly differently, I do see that it's a SIGFPE (arithmetic
exception), but all it shows is that the error is inside TSSolve, and nothing
further than that.

[sajid@xrm free_space]$ gdb ex_modify
GNU gdb (GDB) Red Hat Enterprise Linux 7.6.1-100.el7
Copyright (C) 2013 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later 
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.  Type "show copying"
and "show warranty" for details.
This GDB was configured as "x86_64-redhat-linux-gnu".
For bug reporting instructions, please see:
...
Reading symbols from
/raid/home/sajid/packages/xwp_petsc/2d/free_space/ex_modify...done.
(gdb) run -ts_type cn --args -fp_trap
Starting program:
/raid/home/sajid/packages/xwp_petsc/2d/free_space/ex_modify -ts_type cn
--args -fp_trap
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib64/libthread_db.so.1".
warning: File
"/raid/home/sajid/packages/spack/opt/spack/linux-rhel7-x86_64/gcc-4.8.5/gcc-7.3.0-qrjpi76aeo4bysagruwwfii6oneh56lj/lib64/libstdc++.
so.6.0.24-gdb.py" auto-loading has been declined by your `auto-load
safe-path' set to "$debugdir:$datadir/auto-load:/usr/bin/mono-gdb.py".
To enable execution of this file add
add-auto-load-safe-path
/raid/home/sajid/packages/spack/opt/spack/linux-rhel7-x86_64/gcc-4.8.5/gcc-7.3.0-qrjpi76aeo4bysagruwwfii6oneh56lj/lib64/libstdc++.
so.6.0.24-gdb.py
line to your configuration file "/raid/home/sajid/.gdbinit".
To completely disable this security protection add
set auto-load safe-path /
line to your configuration file "/raid/home/sajid/.gdbinit".
For more information about this security protection see the
"Auto-loading safe path" section in the GDB manual.  E.g., run from the
shell:
info "(gdb)Auto-loading safe path"
Solving a linear TS problem on 1 processor
mx : 512, my: 512 lambda : 1.239840e-10

Program received signal SIGFPE, Arithmetic exception.
__muldc3 (a=-6.6364040265716871e-306, b=1.1689456061105587e-305,
c=-0.0024992568840190117, d=0.024886737403015963)
at
/raid/home/sajid/packages/spack/var/spack/stage/gcc-7.3.0-qrjpi76aeo4bysagruwwfii6oneh56lj/gcc-7.3.0/libgcc/libgcc2.c:1978
1978
/raid/home/sajid/packages/spack/var/spack/stage/gcc-7.3.0-qrjpi76aeo4bysagruwwfii6oneh56lj/gcc-7.3.0/libgcc/libgcc2.c:
No such file or directory.
Missing separate debuginfos, use: debuginfo-install blas-3.4.2-8.el7.x86_64
bzip2-libs-1.0.6-13.el7.x86_64 elfutils-libelf-0.168-8.el7.x86_64
elfutils-libs-0.168-8.el7.x86_64 glibc-2.17-196.el7.x86_64
lapack-3.4.2-8.el7.x86_64 libattr-2.4.46-12.el7.x86_64
libcap-2.22-9.el7.x86_64 libgfortran-4.8.5-16.el7.x86_64
libxml2-2.9.1-6.el7_2.3.x86_64 systemd-libs-219-42.el7_4.4.x86_64
xz-libs-5.2.2-1.el7.x86_64 zlib-1.2.7-17.el7.x86_64
(gdb) bt
#0  __muldc3 (a=-6.6364040265716871e-306, b=1.1689456061105587e-305,
c=-0.0024992568840190117, d=0.024886737403015963)
at
/raid/home/sajid/packages/spack/var/spack/stage/gcc-7.3.0-qrjpi76aeo4bysagruwwfii6oneh56lj/gcc-7.3.0/libgcc/libgcc2.c:1978
#1  0x7630fe87 in MatSolve_SeqAIJ_NaturalOrdering ()
   from
/raid/home/sajid/packages/spack/opt/spack/linux-rhel7-x86_64/gcc-7.3.0/petsc-develop-2u6vuwagkoczyvnpsubzrubmtmpfhhkj/lib/libpetsc.so.3.010
#2  0x75ba61a8 in MatSolve ()
   from
/raid/home/sajid/packages/spack/opt/spack/linux-rhel7-x86_64/gcc-7.3.0/petsc-develop-2u6vuwagkoczyvnpsubzrubmtmpfhhkj/lib/libpetsc.so.3.010
#3  0x76bc8a55 in PCApply_ILU ()
   from
/raid/home/sajid/packages/spack/opt/spack/linux-rhel7-x86_64/gcc-7.3.0/petsc-develop-2u6vuwagkoczyvnpsubzrubmtmpfhhkj/lib/libpetsc.so.3.010
#4  0x76cde6eb in PCApply ()
   from
/raid/home/sajid/packages/spack/opt/spack/linux-rhel7-x86_64/gcc-7.3.0/petsc-develop-2u6vuwagkoczyvnpsubzrubmtmpfhhkj/lib/libpetsc.so.3.010
#5  0x76e3ad4a in KSP_PCApply ()
   from
/raid/home/sajid/packages/spack/opt/spack/linux-rhel7-x86_64/gcc-7.3.0/petsc-develop-2u6vuwagkoczyvnpsubzrubmtmpfhhkj/lib/libpetsc.so.3.010
#6  0x76e3bc36 in KSPInitialResidual ()
   from
/raid/home/sajid/packages/spack/opt/spack/linux-rhel7-x86_64/gcc-7.3.0/petsc-develop-2u6vuwagkoczyvnpsubzrubmtmpfhhkj/lib/libpetsc.so.3.010
#7  0x76dc0736 in KSPSolve_GMRES ()
   from
/raid/home/sajid/packages/spack/opt/spack/linux-rhel7-x86_64/gcc-7.3.0/petsc-develop-2u6vuwagkoczyvnpsubzrubmtmpfhhkj/lib/libpetsc.so.3.010
#8  0x76e1158e in KSPSolve ()
   from
/raid/home/sajid/packages/spack/opt/spack/linux-rhel7-x86_64/gcc-7.3.0/petsc-develop-2u6vuwagkoczyvnpsubzrubmtmpfhhkj/lib/libpetsc.so.3.010
#9  0x76fac311 in SNESSolve_KSPONLY ()
   from
/raid/home/sajid/packages/spack/opt/spack/linux-rhel7-x86_64/gcc-7.3.0/petsc-develop-2u6vuwagkoczyvnpsubzrubmtmpfhhkj/lib/libpetsc.so.3.010
#10 0x76f346c7 in SNESSolve ()
   from
