Re: [petsc-users] MatAssemblyBegin freezes during MPI communication

2024-01-18 Thread 袁煕
Thanks for your explanation.

It turns out the problem was caused by calling MatDiagonalSet() before
MatAssemblyBegin(). It is resolved by moving the MatDiagonalSet() call to
after MatAssemblyBegin().
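
For reference, the corrected call order looks roughly like the small petsc4py
sketch below (an illustration only, not the actual FEM code):

```
from petsc4py import PETSc

n = 10
A = PETSc.Mat().createAIJ([n, n], nnz=3, comm=PETSc.COMM_WORLD)
rstart, rend = A.getOwnershipRange()
for i in range(rstart, rend):
    A.setValue(i, i, 1.0)
# Assembly is collective: every rank must call it, even a rank that inserted nothing.
A.assemblyBegin()
A.assemblyEnd()
# Modify the diagonal only after the matrix has been assembled.
d = A.createVecLeft()
d.set(2.0)
A.setDiagonal(d, PETSc.InsertMode.INSERT_VALUES)
```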

Many thanks for your help.
Xi YUAN, PhD Solid Mechanics

2024年1月18日(木) 22:20 Junchao Zhang :

>
>
>
> On Thu, Jan 18, 2024 at 1:47 AM 袁煕  wrote:
>
>> Dear PETSc Experts,
>>
>> My FEM program generally works well, but in some specific cases where
>> multiple CPUs are used, it freezes when calling MatAssemblyBegin, where
>> PMPI_Allreduce is called (see attached file).
>>
>> After some investigation, I found that it is most probably due to
>>
>> ・ MatSetValue is not called from all CPUs before MatAssemblyBegin
>>
>> For example, when 4 CPUs are used, if there are elements on CPUs 0, 1, and 2 but
>> no elements on CPU 3, then all CPUs other than CPU 3 would call the
>> MatSetValue function. I want to know:
>>
>> 1. Could my conjecture be right? And if so,
>>
> No.  All processes do an MPI_Allreduce to learn whether there are incoming values
> set by others.  To find out why it hangs, you can attach gdb to all MPI
> processes to see where each one is.
>
>>
>>
>> 2. Are there any convenient means to avoid this problem?
>>
>> Thanks,
>> Xi YUAN, PhD Solid Mechanics
>>
>


Re: [petsc-users] HashMap Error when populating AIJCUSPARSE matrix

2024-01-18 Thread Barry Smith

   Thanks. That's the same version I tried.


> On Jan 18, 2024, at 6:09 PM, Yesypenko, Anna  wrote:
> 
> Hi Barry,
> 
> I'm using version 3.20.3. The TACC system is Lonestar6.
> 
> Best,
> Anna
> From: Barry Smith <bsm...@petsc.dev>
> Sent: Thursday, January 18, 2024 4:43 PM
> To: Yesypenko, Anna <a...@oden.utexas.edu>
> Cc: petsc-users@mcs.anl.gov <petsc-users@mcs.anl.gov>; Victor Eijkhout 
> <eijkh...@tacc.utexas.edu>
> Subject: Re: [petsc-users] HashMap Error when populating AIJCUSPARSE matrix
>  
> 
>Ok, I ran it on an ANL machine with CUDA and it worked fine for many runs, 
> even increased the problem size without producing any problems. Both versions 
> of the Python code. 
> 
>Anna,
> 
>What version of PETSc are you using?
> 
>Victor,
> 
>Does anyone at ANL have access to this TACC system to try to reproduce?
> 
> 
>   Barry
> 
>
> 
>> On Jan 18, 2024, at 4:38 PM, Barry Smith wrote:
>> 
>> 
>>It is using the hash map system for inserting values which only inserts 
>> on the CPU, not on the GPU. So I don't see that it would be moving any data 
>> to the GPU until the mat assembly() is done which it never gets to. Hence I 
>> have trouble understanding why the GPU has anything to do with the crash. 
>> 
>>I guess I need to try to reproduce it on a GPU system.
>> 
>>Barry
>> 
>> 
>> 
>> 
>>> On Jan 18, 2024, at 4:28 PM, Matthew Knepley wrote:
>>> 
>>> On Thu, Jan 18, 2024 at 4:18 PM Yesypenko, Anna wrote:
>>> Hi Matt, Barry,
>>> 
>>> Apologies for the extra dependency on scipy. I can replicate the error by 
>>> calling setValue (i,j,v) in a loop as well.
>>> In roughly half of 10 runs, the following script fails because of an error 
>>> in hashmapijv – the same as my original post.
>>> It successfully runs without error the other times.
>>> 
>>> Barry is right that it's CUDA specific. The script runs fine on the CPU.
>>> Do you have any suggestions or example scripts on assigning entries to a 
>>> AIJCUSPARSE matrix?
>>> 
>>> Oh, you definitely do not want to be doing this. I believe you would rather
>>> 
>>> 1) Make the CPU matrix and then convert to AIJCUSPARSE. This is efficient.
>>> 
>>> 2) Produce the values on the GPU and call
>>> 
>>>   https://petsc.org/main/manualpages/Mat/MatSetPreallocationCOO/
>>>   https://petsc.org/main/manualpages/Mat/MatSetValuesCOO/
>>> 
>>>   This is what most people do who are forming matrices directly on the GPU.
>>> 
>>> What you are currently doing is incredibly inefficient, and I think 
>>> accounts for you running out of memory.
>>> It talks back and forth between the CPU and GPU.
>>> 
>>>   Thanks,
>>> 
>>>  Matt
>>> 
>>> Here is a minimum snippet that doesn't depend on scipy.
>>> ```
>>> from petsc4py import PETSc
>>> import numpy as np
>>> 
>>> n = int(5e5); 
>>> nnz = 3 * np.ones(n, dtype=np.int32)
>>> nnz[0] = nnz[-1] = 2
>>> A = PETSc.Mat(comm=PETSc.COMM_WORLD)
>>> A.createAIJ(size=[n,n],comm=PETSc.COMM_WORLD,nnz=nnz)
>>> A.setType('aijcusparse')
>>> 
>>> A.setValue(0, 0, 2)
>>> A.setValue(0, 1, -1)
>>> A.setValue(n-1, n-2, -1)
>>> A.setValue(n-1, n-1, 2)
>>> 
>>> for index in range(1, n - 1):
>>>  A.setValue(index, index - 1, -1)
>>>  A.setValue(index, index, 2)
>>>  A.setValue(index, index + 1, -1)
>>> A.assemble()
>>> ```
>>> If it means anything to you, when the hash error occurs, it is for index 
>>> 67283 after filling 201851 nonzero values.
>>> 
>>> Thank you for your help and suggestions!
>>> Anna
>>> 
>>> From: Barry Smith <bsm...@petsc.dev>
>>> Sent: Thursday, January 18, 2024 2:35 PM
>>> To: Yesypenko, Anna <a...@oden.utexas.edu>
>>> Cc: petsc-users@mcs.anl.gov <petsc-users@mcs.anl.gov>
>>> Subject: Re: [petsc-users] HashMap Error when populating AIJCUSPARSE matrix
>>>  
>>> 
>>>Do you ever get a problem with 'aij` ?   Can you run in a loop with 
>>> 'aij' to confirm it doesn't fail then?
>>> 
>>>
>>> 
>>>Barry
>>> 
>>> 
 On Jan 17, 2024, at 4:51 PM, Yesypenko, Anna wrote:
 
 Dear Petsc users/developers,
 
 I'm experiencing a bug when using petsc4py with GPU support. It may be my 
 mistake in how I set up a AIJCUSPARSE matrix.
 For larger matrices, I sometimes encounter a error in assigning matrix 
 values; the error is thrown in PetscHMapIJVQuerySet().
 Here is a minimum snippet that populates a sparse tridiagonal matrix. 
 
 ```
 from petsc4py import PETSc
 from scipy.sparse import diags
 import numpy as np
 
 n = int(5e5); 
 
 nnz = 3 * np.ones(n, dtype=np.int32); nnz[0] = nnz[-1] = 2
 A = PETSc.Mat(comm=PETSc.COMM_WORLD)
 A.createAIJ(size=[n,n],comm=PETSc.COMM_WORLD,nnz=nnz)
 A.setType('aijcusparse')
 tmp = diags([-1,2,-1],[-1,0,+1],shape=(n,n)).tocsr()

Re: [petsc-users] HashMap Error when populating AIJCUSPARSE matrix

2024-01-18 Thread Yesypenko, Anna
Hi Barry,

I'm using version 3.20.3. The TACC system is Lonestar6.

Best,
Anna

From: Barry Smith 
Sent: Thursday, January 18, 2024 4:43 PM
To: Yesypenko, Anna 
Cc: petsc-users@mcs.anl.gov ; Victor Eijkhout 

Subject: Re: [petsc-users] HashMap Error when populating AIJCUSPARSE matrix


   Ok, I ran it on an ANL machine with CUDA and it worked fine for many runs, 
even increased the problem size without producing any problems. Both versions 
of the Python code.

   Anna,

   What version of PETSc are you using?

   Victor,

   Does anyone at ANL have access to this TACC system to try to reproduce?


  Barry



On Jan 18, 2024, at 4:38 PM, Barry Smith  wrote:


   It is using the hash map system for inserting values which only inserts on 
the CPU, not on the GPU. So I don't see that it would be moving any data to the 
GPU until the mat assembly() is done which it never gets to. Hence I have 
trouble understanding why the GPU has anything to do with the crash.

   I guess I need to try to reproduce it on a GPU system.

   Barry




On Jan 18, 2024, at 4:28 PM, Matthew Knepley  wrote:

On Thu, Jan 18, 2024 at 4:18 PM Yesypenko, Anna 
<a...@oden.utexas.edu> wrote:
Hi Matt, Barry,

Apologies for the extra dependency on scipy. I can replicate the error by 
calling setValue (i,j,v) in a loop as well.
In roughly half of 10 runs, the following script fails because of an error in 
hashmapijv – the same as my original post.
It successfully runs without error the other times.

Barry is right that it's CUDA specific. The script runs fine on the CPU.
Do you have any suggestions or example scripts on assigning entries to a 
AIJCUSPARSE matrix?

Oh, you definitely do not want to be doing this. I believe you would rather

1) Make the CPU matrix and then convert to AIJCUSPARSE. This is efficient.

2) Produce the values on the GPU and call

  https://petsc.org/main/manualpages/Mat/MatSetPreallocationCOO/
  https://petsc.org/main/manualpages/Mat/MatSetValuesCOO/

  This is what most people do who are forming matrices directly on the GPU.

What you are currently doing is incredibly inefficient, and I think accounts 
for you running out of memory.
It talks back and forth between the CPU and GPU.

  Thanks,

 Matt

Here is a minimum snippet that doesn't depend on scipy.
```
from petsc4py import PETSc
import numpy as np

n = int(5e5);
nnz = 3 * np.ones(n, dtype=np.int32)
nnz[0] = nnz[-1] = 2
A = PETSc.Mat(comm=PETSc.COMM_WORLD)
A.createAIJ(size=[n,n],comm=PETSc.COMM_WORLD,nnz=nnz)
A.setType('aijcusparse')

A.setValue(0, 0, 2)
A.setValue(0, 1, -1)
A.setValue(n-1, n-2, -1)
A.setValue(n-1, n-1, 2)

for index in range(1, n - 1):
 A.setValue(index, index - 1, -1)
 A.setValue(index, index, 2)
 A.setValue(index, index + 1, -1)
A.assemble()
```
If it means anything to you, when the hash error occurs, it is for index 67283 
after filling 201851 nonzero values.

Thank you for your help and suggestions!
Anna


From: Barry Smith <bsm...@petsc.dev>
Sent: Thursday, January 18, 2024 2:35 PM
To: Yesypenko, Anna <a...@oden.utexas.edu>
Cc: petsc-users@mcs.anl.gov <petsc-users@mcs.anl.gov>
Subject: Re: [petsc-users] HashMap Error when populating AIJCUSPARSE matrix


   Do you ever get a problem with 'aij` ?   Can you run in a loop with 'aij' to 
confirm it doesn't fail then?



   Barry


On Jan 17, 2024, at 4:51 PM, Yesypenko, Anna 
<a...@oden.utexas.edu> wrote:

Dear Petsc users/developers,

I'm experiencing a bug when using petsc4py with GPU support. It may be my 
mistake in how I set up a AIJCUSPARSE matrix.
For larger matrices, I sometimes encounter a error in assigning matrix values; 
the error is thrown in PetscHMapIJVQuerySet().
Here is a minimum snippet that populates a sparse tridiagonal matrix.

```
from petsc4py import PETSc
from scipy.sparse import diags
import numpy as np

n = int(5e5);

nnz = 3 * np.ones(n, dtype=np.int32); nnz[0] = nnz[-1] = 2
A = PETSc.Mat(comm=PETSc.COMM_WORLD)
A.createAIJ(size=[n,n],comm=PETSc.COMM_WORLD,nnz=nnz)
A.setType('aijcusparse')
tmp = diags([-1,2,-1],[-1,0,+1],shape=(n,n)).tocsr()
A.setValuesCSR(tmp.indptr,tmp.indices,tmp.data)
### this is the line where the error is thrown.
A.assemble()
```

The error trace is below:
```
File "petsc4py/PETSc/Mat.pyx", line 2603, in petsc4py.PETSc.Mat.setValuesCSR
  File "petsc4py/PETSc/petscmat.pxi", line 1039, in 
petsc4py.PETSc.matsetvalues_csr
  File "petsc4py/PETSc/petscmat.pxi", line 1032, in 
petsc4py.PETSc.matsetvalues_ijv
petsc4py.PETSc.Error: error code 76
[0] MatSetValues() at 
/work/06368/annayesy/ls6/petsc/src/mat/interface/matrix.c:1497
[0] MatSetValues_Seq_Hash() at 
/work/06368/annayesy/ls6/petsc/include/../src/mat/impls/aij/seq/seqhashmatsetvalues.h:52
[0] PetscHMapIJVQuerySet() at 
/work/06368/annayesy/ls6/petsc/include/petsc/private/hashmapijv.h:10
[0] Error in external library

Re: [petsc-users] HashMap Error when populating AIJCUSPARSE matrix

2024-01-18 Thread Barry Smith

   Ok, I ran it on an ANL machine with CUDA and it worked fine for many runs, 
even increased the problem size without producing any problems. Both versions 
of the Python code. 

   Anna,

   What version of PETSc are you using?

   Victor,

   Does anyone at ANL have access to this TACC system to try to reproduce?


  Barry

   

> On Jan 18, 2024, at 4:38 PM, Barry Smith  wrote:
> 
> 
>It is using the hash map system for inserting values which only inserts on 
> the CPU, not on the GPU. So I don't see that it would be moving any data to 
> the GPU until the mat assembly() is done which it never gets to. Hence I have 
> trouble understanding why the GPU has anything to do with the crash. 
> 
>I guess I need to try to reproduce it on a GPU system.
> 
>Barry
> 
> 
> 
> 
>> On Jan 18, 2024, at 4:28 PM, Matthew Knepley  wrote:
>> 
>> On Thu, Jan 18, 2024 at 4:18 PM Yesypenko, Anna wrote:
>>> Hi Matt, Barry,
>>> 
>>> Apologies for the extra dependency on scipy. I can replicate the error by 
>>> calling setValue (i,j,v) in a loop as well.
>>> In roughly half of 10 runs, the following script fails because of an error 
>>> in hashmapijv – the same as my original post.
>>> It successfully runs without error the other times.
>>> 
>>> Barry is right that it's CUDA specific. The script runs fine on the CPU.
>>> Do you have any suggestions or example scripts on assigning entries to a 
>>> AIJCUSPARSE matrix?
>> 
>> Oh, you definitely do not want to be doing this. I believe you would rather
>> 
>> 1) Make the CPU matrix and then convert to AIJCUSPARSE. This is efficient.
>> 
>> 2) Produce the values on the GPU and call
>> 
>>   https://petsc.org/main/manualpages/Mat/MatSetPreallocationCOO/
>>   https://petsc.org/main/manualpages/Mat/MatSetValuesCOO/
>> 
>>   This is what most people do who are forming matrices directly on the GPU.
>> 
>> What you are currently doing is incredibly inefficient, and I think accounts 
>> for you running out of memory.
>> It talks back and forth between the CPU and GPU.
>> 
>>   Thanks,
>> 
>>  Matt
>> 
>>> Here is a minimum snippet that doesn't depend on scipy.
>>> ```
>>> from petsc4py import PETSc
>>> import numpy as np
>>> 
>>> n = int(5e5); 
>>> nnz = 3 * np.ones(n, dtype=np.int32)
>>> nnz[0] = nnz[-1] = 2
>>> A = PETSc.Mat(comm=PETSc.COMM_WORLD)
>>> A.createAIJ(size=[n,n],comm=PETSc.COMM_WORLD,nnz=nnz)
>>> A.setType('aijcusparse')
>>> 
>>> A.setValue(0, 0, 2)
>>> A.setValue(0, 1, -1)
>>> A.setValue(n-1, n-2, -1)
>>> A.setValue(n-1, n-1, 2)
>>> 
>>> for index in range(1, n - 1):
>>>  A.setValue(index, index - 1, -1)
>>>  A.setValue(index, index, 2)
>>>  A.setValue(index, index + 1, -1)
>>> A.assemble()
>>> ```
>>> If it means anything to you, when the hash error occurs, it is for index 
>>> 67283 after filling 201851 nonzero values.
>>> 
>>> Thank you for your help and suggestions!
>>> Anna
>>> 
>>> From: Barry Smith <bsm...@petsc.dev>
>>> Sent: Thursday, January 18, 2024 2:35 PM
>>> To: Yesypenko, Anna <a...@oden.utexas.edu>
>>> Cc: petsc-users@mcs.anl.gov <petsc-users@mcs.anl.gov>
>>> Subject: Re: [petsc-users] HashMap Error when populating AIJCUSPARSE matrix
>>>  
>>> 
>>>Do you ever get a problem with 'aij` ?   Can you run in a loop with 
>>> 'aij' to confirm it doesn't fail then?
>>> 
>>>
>>> 
>>>Barry
>>> 
>>> 
 On Jan 17, 2024, at 4:51 PM, Yesypenko, Anna wrote:
 
 Dear Petsc users/developers,
 
 I'm experiencing a bug when using petsc4py with GPU support. It may be my 
 mistake in how I set up a AIJCUSPARSE matrix.
 For larger matrices, I sometimes encounter a error in assigning matrix 
 values; the error is thrown in PetscHMapIJVQuerySet().
 Here is a minimum snippet that populates a sparse tridiagonal matrix. 
 
 ```
 from petsc4py import PETSc
 from scipy.sparse import diags
 import numpy as np
 
 n = int(5e5); 
 
 nnz = 3 * np.ones(n, dtype=np.int32); nnz[0] = nnz[-1] = 2
 A = PETSc.Mat(comm=PETSc.COMM_WORLD)
 A.createAIJ(size=[n,n],comm=PETSc.COMM_WORLD,nnz=nnz)
 A.setType('aijcusparse')
 tmp = diags([-1,2,-1],[-1,0,+1],shape=(n,n)).tocsr()
 A.setValuesCSR(tmp.indptr,tmp.indices,tmp.data)
 ### this is the line where the error is thrown.
 A.assemble()
 ```
 
 The error trace is below:
 ```
 File "petsc4py/PETSc/Mat.pyx", line 2603, in 
 petsc4py.PETSc.Mat.setValuesCSR
   File "petsc4py/PETSc/petscmat.pxi", line 1039, in 
 petsc4py.PETSc.matsetvalues_csr
   File "petsc4py/PETSc/petscmat.pxi", line 1032, in 
 petsc4py.PETSc.matsetvalues_ijv
 petsc4py.PETSc.Error: error code 76
 [0] MatSetValues() at 
 /work/06368/annayesy/ls6/petsc/src/mat/interface/matrix.c:1497
 [0] MatSetValues_Seq_Hash() at 
 

Re: [petsc-users] HashMap Error when populating AIJCUSPARSE matrix

2024-01-18 Thread Yesypenko, Anna
Hi all,

Matt's suggestions worked great! The script works consistently now.
What I was doing is a bad way to populate sparse matrices on the GPU – I'm not 
sure why it fails but luckily we found a fix.
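
For the archives, option 1) from Matt's reply (assemble a plain 'aij' matrix on
the CPU first, then convert it) might look roughly like this for the tridiagonal
test case; a sketch only, not necessarily the exact fix that was used:

```
from petsc4py import PETSc
import numpy as np

n = int(5e5)
nnz = 3 * np.ones(n, dtype=np.int32); nnz[0] = nnz[-1] = 2
A = PETSc.Mat(comm=PETSc.COMM_WORLD)
A.createAIJ(size=[n, n], comm=PETSc.COMM_WORLD, nnz=nnz)  # plain CPU 'aij'

A.setValue(0, 0, 2)
A.setValue(0, 1, -1)
A.setValue(n - 1, n - 2, -1)
A.setValue(n - 1, n - 1, 2)
for index in range(1, n - 1):
    A.setValue(index, index - 1, -1)
    A.setValue(index, index, 2)
    A.setValue(index, index + 1, -1)
A.assemble()

# Convert to the GPU type only after assembly; subsequent solves run on the GPU.
A.convert('aijcusparse')
```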

Thank you all for your help and suggestions!

Best,
Anna


From: Barry Smith 
Sent: Thursday, January 18, 2024 3:38 PM
To: Yesypenko, Anna 
Cc: petsc-users@mcs.anl.gov ; Victor Eijkhout 

Subject: Re: [petsc-users] HashMap Error when populating AIJCUSPARSE matrix


   It is using the hash map system for inserting values which only inserts on 
the CPU, not on the GPU. So I don't see that it would be moving any data to the 
GPU until the mat assembly() is done which it never gets to. Hence I have 
trouble understanding why the GPU has anything to do with the crash.

   I guess I need to try to reproduce it on a GPU system.

   Barry




On Jan 18, 2024, at 4:28 PM, Matthew Knepley  wrote:

On Thu, Jan 18, 2024 at 4:18 PM Yesypenko, Anna 
<a...@oden.utexas.edu> wrote:
Hi Matt, Barry,

Apologies for the extra dependency on scipy. I can replicate the error by 
calling setValue (i,j,v) in a loop as well.
In roughly half of 10 runs, the following script fails because of an error in 
hashmapijv – the same as my original post.
It successfully runs without error the other times.

Barry is right that it's CUDA specific. The script runs fine on the CPU.
Do you have any suggestions or example scripts on assigning entries to a 
AIJCUSPARSE matrix?

Oh, you definitely do not want to be doing this. I believe you would rather

1) Make the CPU matrix and then convert to AIJCUSPARSE. This is efficient.

2) Produce the values on the GPU and call

  https://petsc.org/main/manualpages/Mat/MatSetPreallocationCOO/
  https://petsc.org/main/manualpages/Mat/MatSetValuesCOO/

  This is what most people do who are forming matrices directly on the GPU.

What you are currently doing is incredibly inefficient, and I think accounts 
for you running out of memory.
It talks back and forth between the CPU and GPU.

  Thanks,

 Matt

Here is a minimum snippet that doesn't depend on scipy.
```
from petsc4py import PETSc
import numpy as np

n = int(5e5);
nnz = 3 * np.ones(n, dtype=np.int32)
nnz[0] = nnz[-1] = 2
A = PETSc.Mat(comm=PETSc.COMM_WORLD)
A.createAIJ(size=[n,n],comm=PETSc.COMM_WORLD,nnz=nnz)
A.setType('aijcusparse')

A.setValue(0, 0, 2)
A.setValue(0, 1, -1)
A.setValue(n-1, n-2, -1)
A.setValue(n-1, n-1, 2)

for index in range(1, n - 1):
 A.setValue(index, index - 1, -1)
 A.setValue(index, index, 2)
 A.setValue(index, index + 1, -1)
A.assemble()
```
If it means anything to you, when the hash error occurs, it is for index 67283 
after filling 201851 nonzero values.

Thank you for your help and suggestions!
Anna


From: Barry Smith <bsm...@petsc.dev>
Sent: Thursday, January 18, 2024 2:35 PM
To: Yesypenko, Anna <a...@oden.utexas.edu>
Cc: petsc-users@mcs.anl.gov <petsc-users@mcs.anl.gov>
Subject: Re: [petsc-users] HashMap Error when populating AIJCUSPARSE matrix


   Do you ever get a problem with 'aij` ?   Can you run in a loop with 'aij' to 
confirm it doesn't fail then?



   Barry


On Jan 17, 2024, at 4:51 PM, Yesypenko, Anna 
<a...@oden.utexas.edu> wrote:

Dear Petsc users/developers,

I'm experiencing a bug when using petsc4py with GPU support. It may be my 
mistake in how I set up a AIJCUSPARSE matrix.
For larger matrices, I sometimes encounter a error in assigning matrix values; 
the error is thrown in PetscHMapIJVQuerySet().
Here is a minimum snippet that populates a sparse tridiagonal matrix.

```
from petsc4py import PETSc
from scipy.sparse import diags
import numpy as np

n = int(5e5);

nnz = 3 * np.ones(n, dtype=np.int32); nnz[0] = nnz[-1] = 2
A = PETSc.Mat(comm=PETSc.COMM_WORLD)
A.createAIJ(size=[n,n],comm=PETSc.COMM_WORLD,nnz=nnz)
A.setType('aijcusparse')
tmp = diags([-1,2,-1],[-1,0,+1],shape=(n,n)).tocsr()
A.setValuesCSR(tmp.indptr,tmp.indices,tmp.data)
### this is the line where the error is thrown.
A.assemble()
```

The error trace is below:
```
File "petsc4py/PETSc/Mat.pyx", line 2603, in petsc4py.PETSc.Mat.setValuesCSR
  File "petsc4py/PETSc/petscmat.pxi", line 1039, in 
petsc4py.PETSc.matsetvalues_csr
  File "petsc4py/PETSc/petscmat.pxi", line 1032, in 
petsc4py.PETSc.matsetvalues_ijv
petsc4py.PETSc.Error: error code 76
[0] MatSetValues() at 
/work/06368/annayesy/ls6/petsc/src/mat/interface/matrix.c:1497
[0] MatSetValues_Seq_Hash() at 
/work/06368/annayesy/ls6/petsc/include/../src/mat/impls/aij/seq/seqhashmatsetvalues.h:52
[0] PetscHMapIJVQuerySet() at 
/work/06368/annayesy/ls6/petsc/include/petsc/private/hashmapijv.h:10
[0] Error in external library
[0] [khash] Assertion: `ret >= 0' failed.
```

If I run the same script a handful of times, it will run without errors 
eventually.
Does anyone have insight on why it is behaving this way? I'm running on a node 
with 3x NVIDIA A100 PCIE 40GB.

Re: [petsc-users] HashMap Error when populating AIJCUSPARSE matrix

2024-01-18 Thread Barry Smith

   It is using the hash map system for inserting values which only inserts on 
the CPU, not on the GPU. So I don't see that it would be moving any data to the 
GPU until the mat assembly() is done which it never gets to. Hence I have 
trouble understanding why the GPU has anything to do with the crash. 

   I guess I need to try to reproduce it on a GPU system.

   Barry




> On Jan 18, 2024, at 4:28 PM, Matthew Knepley  wrote:
> 
> On Thu, Jan 18, 2024 at 4:18 PM Yesypenko, Anna wrote:
>> Hi Matt, Barry,
>> 
>> Apologies for the extra dependency on scipy. I can replicate the error by 
>> calling setValue (i,j,v) in a loop as well.
>> In roughly half of 10 runs, the following script fails because of an error 
>> in hashmapijv – the same as my original post.
>> It successfully runs without error the other times.
>> 
>> Barry is right that it's CUDA specific. The script runs fine on the CPU.
>> Do you have any suggestions or example scripts on assigning entries to a 
>> AIJCUSPARSE matrix?
> 
> Oh, you definitely do not want to be doing this. I believe you would rather
> 
> 1) Make the CPU matrix and then convert to AIJCUSPARSE. This is efficient.
> 
> 2) Produce the values on the GPU and call
> 
>   https://petsc.org/main/manualpages/Mat/MatSetPreallocationCOO/
>   https://petsc.org/main/manualpages/Mat/MatSetValuesCOO/
> 
>   This is what most people do who are forming matrices directly on the GPU.
> 
> What you are currently doing is incredibly inefficient, and I think accounts 
> for you running out of memory.
> It talks back and forth between the CPU and GPU.
> 
>   Thanks,
> 
>  Matt
> 
>> Here is a minimum snippet that doesn't depend on scipy.
>> ```
>> from petsc4py import PETSc
>> import numpy as np
>> 
>> n = int(5e5); 
>> nnz = 3 * np.ones(n, dtype=np.int32)
>> nnz[0] = nnz[-1] = 2
>> A = PETSc.Mat(comm=PETSc.COMM_WORLD)
>> A.createAIJ(size=[n,n],comm=PETSc.COMM_WORLD,nnz=nnz)
>> A.setType('aijcusparse')
>> 
>> A.setValue(0, 0, 2)
>> A.setValue(0, 1, -1)
>> A.setValue(n-1, n-2, -1)
>> A.setValue(n-1, n-1, 2)
>> 
>> for index in range(1, n - 1):
>>  A.setValue(index, index - 1, -1)
>>  A.setValue(index, index, 2)
>>  A.setValue(index, index + 1, -1)
>> A.assemble()
>> ```
>> If it means anything to you, when the hash error occurs, it is for index 
>> 67283 after filling 201851 nonzero values.
>> 
>> Thank you for your help and suggestions!
>> Anna
>> 
>> From: Barry Smith <bsm...@petsc.dev>
>> Sent: Thursday, January 18, 2024 2:35 PM
>> To: Yesypenko, Anna <a...@oden.utexas.edu>
>> Cc: petsc-users@mcs.anl.gov <petsc-users@mcs.anl.gov>
>> Subject: Re: [petsc-users] HashMap Error when populating AIJCUSPARSE matrix
>>  
>> 
>>Do you ever get a problem with 'aij` ?   Can you run in a loop with 'aij' 
>> to confirm it doesn't fail then?
>> 
>>
>> 
>>Barry
>> 
>> 
>>> On Jan 17, 2024, at 4:51 PM, Yesypenko, Anna wrote:
>>> 
>>> Dear Petsc users/developers,
>>> 
>>> I'm experiencing a bug when using petsc4py with GPU support. It may be my 
>>> mistake in how I set up a AIJCUSPARSE matrix.
>>> For larger matrices, I sometimes encounter a error in assigning matrix 
>>> values; the error is thrown in PetscHMapIJVQuerySet().
>>> Here is a minimum snippet that populates a sparse tridiagonal matrix. 
>>> 
>>> ```
>>> from petsc4py import PETSc
>>> from scipy.sparse import diags
>>> import numpy as np
>>> 
>>> n = int(5e5); 
>>> 
>>> nnz = 3 * np.ones(n, dtype=np.int32); nnz[0] = nnz[-1] = 2
>>> A = PETSc.Mat(comm=PETSc.COMM_WORLD)
>>> A.createAIJ(size=[n,n],comm=PETSc.COMM_WORLD,nnz=nnz)
>>> A.setType('aijcusparse')
>>> tmp = diags([-1,2,-1],[-1,0,+1],shape=(n,n)).tocsr()
>>> A.setValuesCSR(tmp.indptr,tmp.indices,tmp.data)
>>> ### this is the line where the error is thrown.
>>> A.assemble()
>>> ```
>>> 
>>> The error trace is below:
>>> ```
>>> File "petsc4py/PETSc/Mat.pyx", line 2603, in petsc4py.PETSc.Mat.setValuesCSR
>>>   File "petsc4py/PETSc/petscmat.pxi", line 1039, in 
>>> petsc4py.PETSc.matsetvalues_csr
>>>   File "petsc4py/PETSc/petscmat.pxi", line 1032, in 
>>> petsc4py.PETSc.matsetvalues_ijv
>>> petsc4py.PETSc.Error: error code 76
>>> [0] MatSetValues() at 
>>> /work/06368/annayesy/ls6/petsc/src/mat/interface/matrix.c:1497
>>> [0] MatSetValues_Seq_Hash() at 
>>> /work/06368/annayesy/ls6/petsc/include/../src/mat/impls/aij/seq/seqhashmatsetvalues.h:52
>>> [0] PetscHMapIJVQuerySet() at 
>>> /work/06368/annayesy/ls6/petsc/include/petsc/private/hashmapijv.h:10
>>> [0] Error in external library
>>> [0] [khash] Assertion: `ret >= 0' failed.
>>> ```
>>> 
>>> If I run the same script a handful of times, it will run without errors 
>>> eventually.
>>> Does anyone have insight on why it is behaving this way? I'm running on a 
>>> node with 3x NVIDIA A100 PCIE 40GB.
>>> 
>>> Thank you!
>>> Anna
>> 
> 
> 
> --
> What 

Re: [petsc-users] HashMap Error when populating AIJCUSPARSE matrix

2024-01-18 Thread Matthew Knepley
On Thu, Jan 18, 2024 at 4:18 PM Yesypenko, Anna 
wrote:

> Hi Matt, Barry,
>
> Apologies for the extra dependency on scipy. I can replicate the error by
> calling setValue (i,j,v) in a loop as well.
> In roughly half of 10 runs, the following script fails because of an error
> in hashmapijv – the same as my original post.
> It successfully runs without error the other times.
>
> Barry is right that it's CUDA specific. The script runs fine on the CPU.
> Do you have any suggestions or example scripts on assigning entries to a
> AIJCUSPARSE matrix?
>

Oh, you definitely do not want to be doing this. I believe you would rather

1) Make the CPU matrix and then convert to AIJCUSPARSE. This is efficient.

2) Produce the values on the GPU and call

  https://petsc.org/main/manualpages/Mat/MatSetPreallocationCOO/
  https://petsc.org/main/manualpages/Mat/MatSetValuesCOO/

  This is what most people do who are forming matrices directly on the GPU.

What you are currently doing is incredibly inefficient, and I think
accounts for you running out of memory.
It talks back and forth between the CPU and GPU.

  Thanks,

 Matt
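
For Anna's tridiagonal example, route 2) might look roughly like the following
petsc4py sketch (assuming the COO bindings Mat.setPreallocationCOO() and
Mat.setValuesCOO() are available in the installed petsc4py; in parallel, each
rank would pass only the triplets for its own rows):

```
from petsc4py import PETSc
import numpy as np

n = int(5e5)
i = np.arange(n, dtype=PETSc.IntType)
# COO triplets for the [-1, 2, -1] stencil: diagonal, sub-diagonal, super-diagonal
rows = np.concatenate([i, i[1:], i[:-1]])
cols = np.concatenate([i, i[1:] - 1, i[:-1] + 1])
vals = np.concatenate([np.full(n, 2.0), np.full(n - 1, -1.0), np.full(n - 1, -1.0)])

A = PETSc.Mat().create(comm=PETSc.COMM_WORLD)
A.setSizes([n, n])
A.setType('aijcusparse')
A.setPreallocationCOO(rows, cols)  # hand PETSc the sparsity pattern once
A.setValuesCOO(vals)               # values may also come from device arrays
A.assemble()
```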

Here is a minimum snippet that doesn't depend on scipy.
> ```
> from petsc4py import PETSc
> import numpy as np
>
> n = int(5e5);
> nnz = 3 * np.ones(n, dtype=np.int32)
> nnz[0] = nnz[-1] = 2
> A = PETSc.Mat(comm=PETSc.COMM_WORLD)
> A.createAIJ(size=[n,n],comm=PETSc.COMM_WORLD,nnz=nnz)
> A.setType('aijcusparse')
>
> A.setValue(0, 0, 2)
> A.setValue(0, 1, -1)
> A.setValue(n-1, n-2, -1)
> A.setValue(n-1, n-1, 2)
>
> for index in range(1, n - 1):
>  A.setValue(index, index - 1, -1)
>  A.setValue(index, index, 2)
>  A.setValue(index, index + 1, -1)
> A.assemble()
> ```
> If it means anything to you, when the hash error occurs, it is for index
> 67283 after filling 201851 nonzero values.
>
> Thank you for your help and suggestions!
> Anna
>
> --
> *From:* Barry Smith 
> *Sent:* Thursday, January 18, 2024 2:35 PM
> *To:* Yesypenko, Anna 
> *Cc:* petsc-users@mcs.anl.gov 
> *Subject:* Re: [petsc-users] HashMap Error when populating AIJCUSPARSE
> matrix
>
>
>Do you ever get a problem with 'aij` ?   Can you run in a loop with
> 'aij' to confirm it doesn't fail then?
>
>
>
>Barry
>
>
> On Jan 17, 2024, at 4:51 PM, Yesypenko, Anna  wrote:
>
> Dear Petsc users/developers,
>
> I'm experiencing a bug when using petsc4py with GPU support. It may be my
> mistake in how I set up a AIJCUSPARSE matrix.
> For larger matrices, I sometimes encounter a error in assigning matrix
> values; the error is thrown in PetscHMapIJVQuerySet().
> Here is a minimum snippet that populates a sparse tridiagonal matrix.
>
> ```
> from petsc4py import PETSc
> from scipy.sparse import diags
> import numpy as np
>
> n = int(5e5);
>
> nnz = 3 * np.ones(n, dtype=np.int32); nnz[0] = nnz[-1] = 2
> A = PETSc.Mat(comm=PETSc.COMM_WORLD)
> A.createAIJ(size=[n,n],comm=PETSc.COMM_WORLD,nnz=nnz)
> A.setType('aijcusparse')
> tmp = diags([-1,2,-1],[-1,0,+1],shape=(n,n)).tocsr()
> A.setValuesCSR(tmp.indptr,tmp.indices,tmp.data)
> ### this is the line where the error is thrown.
> A.assemble()
> ```
>
> The error trace is below:
> ```
> File "petsc4py/PETSc/Mat.pyx", line 2603, in
> petsc4py.PETSc.Mat.setValuesCSR
>   File "petsc4py/PETSc/petscmat.pxi", line 1039, in
> petsc4py.PETSc.matsetvalues_csr
>   File "petsc4py/PETSc/petscmat.pxi", line 1032, in
> petsc4py.PETSc.matsetvalues_ijv
> petsc4py.PETSc.Error: error code 76
> [0] MatSetValues() at
> /work/06368/annayesy/ls6/petsc/src/mat/interface/matrix.c:1497
> [0] MatSetValues_Seq_Hash() at
> /work/06368/annayesy/ls6/petsc/include/../src/mat/impls/aij/seq/seqhashmatsetvalues.h:52
> [0] PetscHMapIJVQuerySet() at
> /work/06368/annayesy/ls6/petsc/include/petsc/private/hashmapijv.h:10
> [0] Error in external library
> [0] [khash] Assertion: `ret >= 0' failed.
> ```
>
> If I run the same script a handful of times, it will run without errors
> eventually.
> Does anyone have insight on why it is behaving this way? I'm running on a
> node with 3x NVIDIA A100 PCIE 40GB.
>
> Thank you!
> Anna
>
>
>

-- 
What most experimenters take for granted before they begin their
experiments is infinitely more interesting than any results to which their
experiments lead.
-- Norbert Wiener

https://www.cse.buffalo.edu/~knepley/ 


Re: [petsc-users] HashMap Error when populating AIJCUSPARSE matrix

2024-01-18 Thread Yesypenko, Anna
Hi Matt, Barry,

Apologies for the extra dependency on scipy. I can replicate the error by 
calling setValue(i,j,v) in a loop as well.
In roughly half of 10 runs, the following script fails because of an error in 
hashmapijv – the same as in my original post.
It runs successfully without error the other times.

Barry is right that it's CUDA-specific. The script runs fine on the CPU.
Do you have any suggestions or example scripts for assigning entries to an 
AIJCUSPARSE matrix?

Here is a minimum snippet that doesn't depend on scipy.
```
from petsc4py import PETSc
import numpy as np

n = int(5e5);
nnz = 3 * np.ones(n, dtype=np.int32)
nnz[0] = nnz[-1] = 2
A = PETSc.Mat(comm=PETSc.COMM_WORLD)
A.createAIJ(size=[n,n],comm=PETSc.COMM_WORLD,nnz=nnz)
A.setType('aijcusparse')

A.setValue(0, 0, 2)
A.setValue(0, 1, -1)
A.setValue(n-1, n-2, -1)
A.setValue(n-1, n-1, 2)

for index in range(1, n - 1):
 A.setValue(index, index - 1, -1)
 A.setValue(index, index, 2)
 A.setValue(index, index + 1, -1)
A.assemble()
```
If it means anything to you, when the hash error occurs, it is for index 67283 
after filling 201851 nonzero values.

Thank you for your help and suggestions!
Anna


From: Barry Smith 
Sent: Thursday, January 18, 2024 2:35 PM
To: Yesypenko, Anna 
Cc: petsc-users@mcs.anl.gov 
Subject: Re: [petsc-users] HashMap Error when populating AIJCUSPARSE matrix


   Do you ever get a problem with 'aij` ?   Can you run in a loop with 'aij' to 
confirm it doesn't fail then?



   Barry


On Jan 17, 2024, at 4:51 PM, Yesypenko, Anna  wrote:

Dear Petsc users/developers,

I'm experiencing a bug when using petsc4py with GPU support. It may be my 
mistake in how I set up an AIJCUSPARSE matrix.
For larger matrices, I sometimes encounter an error in assigning matrix values; 
the error is thrown in PetscHMapIJVQuerySet().
Here is a minimum snippet that populates a sparse tridiagonal matrix.

```
from petsc4py import PETSc
from scipy.sparse import diags
import numpy as np

n = int(5e5);

nnz = 3 * np.ones(n, dtype=np.int32); nnz[0] = nnz[-1] = 2
A = PETSc.Mat(comm=PETSc.COMM_WORLD)
A.createAIJ(size=[n,n],comm=PETSc.COMM_WORLD,nnz=nnz)
A.setType('aijcusparse')
tmp = diags([-1,2,-1],[-1,0,+1],shape=(n,n)).tocsr()
A.setValuesCSR(tmp.indptr,tmp.indices,tmp.data)
### this is the line where the error is thrown.
A.assemble()
```

The error trace is below:
```
File "petsc4py/PETSc/Mat.pyx", line 2603, in petsc4py.PETSc.Mat.setValuesCSR
  File "petsc4py/PETSc/petscmat.pxi", line 1039, in 
petsc4py.PETSc.matsetvalues_csr
  File "petsc4py/PETSc/petscmat.pxi", line 1032, in 
petsc4py.PETSc.matsetvalues_ijv
petsc4py.PETSc.Error: error code 76
[0] MatSetValues() at 
/work/06368/annayesy/ls6/petsc/src/mat/interface/matrix.c:1497
[0] MatSetValues_Seq_Hash() at 
/work/06368/annayesy/ls6/petsc/include/../src/mat/impls/aij/seq/seqhashmatsetvalues.h:52
[0] PetscHMapIJVQuerySet() at 
/work/06368/annayesy/ls6/petsc/include/petsc/private/hashmapijv.h:10
[0] Error in external library
[0] [khash] Assertion: `ret >= 0' failed.
```

If I run the same script a handful of times, it will run without errors 
eventually.
Does anyone have insight on why it is behaving this way? I'm running on a node 
with 3x NVIDIA A100 PCIE 40GB.

Thank you!
Anna



Re: [petsc-users] HashMap Error when populating AIJCUSPARSE matrix

2024-01-18 Thread Barry Smith

   It appears to be crashing in kh_resize() in khash.h on a memory allocation 
failure when it tries to get additional memory for storing the matrix.

   This code seems to be only using the CPU memory so it should also fail in a 
similar way with 'aij'.   

  But the matrix is not large, so I don't think it should be running out of 
memory. I cannot reproduce the crash with the same parameters on my non-CUDA 
machine, so debugging will be tricky.

   Barry






> On Jan 18, 2024, at 3:35 PM, Barry Smith  wrote:
> 
> 
>Do you ever get a problem with 'aij` ?   Can you run in a loop with 'aij' 
> to confirm it doesn't fail then?
> 
>
> 
>Barry
> 
> 
>> On Jan 17, 2024, at 4:51 PM, Yesypenko, Anna  wrote:
>> 
>> Dear Petsc users/developers,
>> 
>> I'm experiencing a bug when using petsc4py with GPU support. It may be my 
>> mistake in how I set up a AIJCUSPARSE matrix.
>> For larger matrices, I sometimes encounter a error in assigning matrix 
>> values; the error is thrown in PetscHMapIJVQuerySet().
>> Here is a minimum snippet that populates a sparse tridiagonal matrix. 
>> 
>> ```
>> from petsc4py import PETSc
>> from scipy.sparse import diags
>> import numpy as np
>> 
>> n = int(5e5); 
>> 
>> nnz = 3 * np.ones(n, dtype=np.int32); nnz[0] = nnz[-1] = 2
>> A = PETSc.Mat(comm=PETSc.COMM_WORLD)
>> A.createAIJ(size=[n,n],comm=PETSc.COMM_WORLD,nnz=nnz)
>> A.setType('aijcusparse')
>> tmp = diags([-1,2,-1],[-1,0,+1],shape=(n,n)).tocsr()
>> A.setValuesCSR(tmp.indptr,tmp.indices,tmp.data)
>> ### this is the line where the error is thrown.
>> A.assemble()
>> ```
>> 
>> The error trace is below:
>> ```
>> File "petsc4py/PETSc/Mat.pyx", line 2603, in petsc4py.PETSc.Mat.setValuesCSR
>>   File "petsc4py/PETSc/petscmat.pxi", line 1039, in 
>> petsc4py.PETSc.matsetvalues_csr
>>   File "petsc4py/PETSc/petscmat.pxi", line 1032, in 
>> petsc4py.PETSc.matsetvalues_ijv
>> petsc4py.PETSc.Error: error code 76
>> [0] MatSetValues() at 
>> /work/06368/annayesy/ls6/petsc/src/mat/interface/matrix.c:1497
>> [0] MatSetValues_Seq_Hash() at 
>> /work/06368/annayesy/ls6/petsc/include/../src/mat/impls/aij/seq/seqhashmatsetvalues.h:52
>> [0] PetscHMapIJVQuerySet() at 
>> /work/06368/annayesy/ls6/petsc/include/petsc/private/hashmapijv.h:10
>> [0] Error in external library
>> [0] [khash] Assertion: `ret >= 0' failed.
>> ```
>> 
>> If I run the same script a handful of times, it will run without errors 
>> eventually.
>> Does anyone have insight on why it is behaving this way? I'm running on a 
>> node with 3x NVIDIA A100 PCIE 40GB.
>> 
>> Thank you!
>> Anna
> 



Re: [petsc-users] HashMap Error when populating AIJCUSPARSE matrix

2024-01-18 Thread Barry Smith

   Do you ever get a problem with 'aij'? Can you run in a loop with 'aij' to 
confirm it doesn't fail then?

   

   Barry


> On Jan 17, 2024, at 4:51 PM, Yesypenko, Anna  wrote:
> 
> Dear Petsc users/developers,
> 
> I'm experiencing a bug when using petsc4py with GPU support. It may be my 
> mistake in how I set up a AIJCUSPARSE matrix.
> For larger matrices, I sometimes encounter a error in assigning matrix 
> values; the error is thrown in PetscHMapIJVQuerySet().
> Here is a minimum snippet that populates a sparse tridiagonal matrix. 
> 
> ```
> from petsc4py import PETSc
> from scipy.sparse import diags
> import numpy as np
> 
> n = int(5e5); 
> 
> nnz = 3 * np.ones(n, dtype=np.int32); nnz[0] = nnz[-1] = 2
> A = PETSc.Mat(comm=PETSc.COMM_WORLD)
> A.createAIJ(size=[n,n],comm=PETSc.COMM_WORLD,nnz=nnz)
> A.setType('aijcusparse')
> tmp = diags([-1,2,-1],[-1,0,+1],shape=(n,n)).tocsr()
> A.setValuesCSR(tmp.indptr,tmp.indices,tmp.data)
> ### this is the line where the error is thrown.
> A.assemble()
> ```
> 
> The error trace is below:
> ```
> File "petsc4py/PETSc/Mat.pyx", line 2603, in petsc4py.PETSc.Mat.setValuesCSR
>   File "petsc4py/PETSc/petscmat.pxi", line 1039, in 
> petsc4py.PETSc.matsetvalues_csr
>   File "petsc4py/PETSc/petscmat.pxi", line 1032, in 
> petsc4py.PETSc.matsetvalues_ijv
> petsc4py.PETSc.Error: error code 76
> [0] MatSetValues() at 
> /work/06368/annayesy/ls6/petsc/src/mat/interface/matrix.c:1497
> [0] MatSetValues_Seq_Hash() at 
> /work/06368/annayesy/ls6/petsc/include/../src/mat/impls/aij/seq/seqhashmatsetvalues.h:52
> [0] PetscHMapIJVQuerySet() at 
> /work/06368/annayesy/ls6/petsc/include/petsc/private/hashmapijv.h:10
> [0] Error in external library
> [0] [khash] Assertion: `ret >= 0' failed.
> ```
> 
> If I run the same script a handful of times, it will run without errors 
> eventually.
> Does anyone have insight on why it is behaving this way? I'm running on a 
> node with 3x NVIDIA A100 PCIE 40GB.
> 
> Thank you!
> Anna



Re: [petsc-users] ScaLAPACK EPS error

2024-01-18 Thread Peder Jørgensgaard Olesen via petsc-users
It appears my setup doesn't allow me to use versions > 3.17.4, unfortunately (I 
believe I'll need to speak to admin for this).

Best,
Peder

From: Barry Smith 
Sent: 18 January 2024 19:29
To: Peder Jørgensgaard Olesen 
Cc: petsc-users@mcs.anl.gov 
Subject: Re: [petsc-users] ScaLAPACK EPS error


   Looks like you are using an older version of PETSc. Could you please switch 
to the latest and try again and send same information if that also fails.

  Barry


On Jan 18, 2024, at 12:59 PM, Peder Jørgensgaard Olesen via petsc-users 
 wrote:

Hello,

I need to determine the full set of eigenpairs to a rather large (N=16,000) 
dense Hermitian matrix. I've managed to do this using SLEPc's standard 
Krylov-Schur EPS, but I think it could be done more efficiently using 
ScaLAPACK. I receive the following error when attempting this. As I understand 
it, descinit is used to initialize an array, and the variable in question 
designates the leading dimension of the array, for which it seems an illegal 
value is somehow passed.

I know ScaLAPACK is an external package, but it seems as if the error would be 
in the call from SLEPc. Any ideas as to what could cause this?

Thanks,
Peder

Error message (excerpt):

PETSC ERROR: #1 MatConvert_Dense_ScaLAPACK() at [...]/matscalapack.c:1032
PETSC ERROR: #2 MatConvert at [...]/matrix.c:4250
PETSC ERROR: #3 EPSSetUp_ScaLAPACK() at [...]/scalapack.c:47
PETSC ERROR: #4 EPSSetUp() at [...]/epssetup.c:323
PETSC ERROR: #5 EPSSolve at [...]/epssolve.c:134
PETSC ERROR: -- Error message --
PETSC ERROR: Error in external library
PETSC ERROR: Error in ScaLAPACK subroutine descinit: info=-9
(...)

Log file (excerpt):
{  357,0}:  On entry to DESCINIT parameter number   9 had an illegal value
[and a few hundred lines similar to this]



Re: [petsc-users] Swarm view HDF5

2024-01-18 Thread Matthew Knepley
On Thu, Jan 18, 2024 at 1:20 PM Mark Adams  wrote:

> I had this working at one point.
>
> Should I add PetscViewerHDF5PushTimestepping?
> I don't create a viewer now, but I could make one.
>

That will make it work. The real fix would be to check at swarm.c:45 to see
whether timestepping is set.

  Thanks,

Matt


> Thanks,
> Mark
>
>
> On Thu, Jan 18, 2024 at 11:26 AM Matthew Knepley 
> wrote:
>
>> On Thu, Jan 18, 2024 at 10:08 AM Mark Adams  wrote:
>>
>>> I am trying to view a DMSwarm with: -weights_view hdf5:part.h5
>>>
>>>  Vec f;
>>>   PetscCall(DMSetOutputSequenceNumber(sw, 0, 0.0));
>>>   PetscCall(DMSwarmCreateGlobalVectorFromField(sw, "w_q", &f));
>>>   PetscCall(PetscObjectSetName((PetscObject)f, "particle weights"));
>>>   PetscCall(VecViewFromOptions(f, NULL, "-weights_view"));
>>>   PetscCall(DMSwarmDestroyGlobalVectorFromField(sw, "w_q", &f));
>>>
>>> And I get this error. I had this working once and did not set
>>> PetscViewerHDF5PushTimestepping, so I wanted to check.
>>>
>>
>> We probably were not checking then. We might have to check there when we
>> set the timestep.
>>
>>   Thanks,
>>
>> Matt
>>
>>
>>> Thanks,
>>> Mark
>>>
>>>
>>> [0]PETSC ERROR: Object is in wrong state
>>> [0]PETSC ERROR: Timestepping has not been pushed yet. Call
>>> PetscViewerHDF5PushTimestepping() first
>>> [0]PETSC ERROR: WARNING! There are unused option(s) set! Could be the
>>> program crashed before usage or a spelling mistake, etc!
>>> [0]PETSC ERROR:   Option left: name:-options_left (no value) source:
>>> command line
>>> [0]PETSC ERROR: See https://petsc.org/release/faq/ for trouble shooting.
>>> [0]PETSC ERROR: Petsc Development GIT revision: v3.20.3-461-g585a01bd688
>>>  GIT Date: 2024-01-16 23:32:45 +
>>> [0]PETSC ERROR: ./ex30k on a arch-macosx-gnu-O named MarksMac-302.local
>>> by markadams Thu Jan 18 10:05:53 2024
>>> [0]PETSC ERROR: Configure options CFLAGS="-g
>>> -Wno-deprecated-declarations " CXXFLAGS="-g -Wno-deprecated-declarations "
>>> COPTFLAGS=-O CXXOPTFLAGS=-O --with-cc=/usr/local/opt/llvm/bin/clang
>>> --with-cxx=/usr/local/opt/llvm/bin/clang++ --download-mpich
>>> --with-strict-petscerrorcode --download-triangle=1 --with-debugging=0
>>> --download-hdf5=1 PETSC_ARCH=arch-macosx-gnu-O
>>> [0]PETSC ERROR: #1 PetscViewerHDF5SetTimestep() at
>>> /Users/markadams/Codes/petsc/src/sys/classes/viewer/impls/hdf5/hdf5v.c:990
>>> [0]PETSC ERROR: #2 VecView_Swarm_HDF5_Internal() at
>>> /Users/markadams/Codes/petsc/src/dm/impls/swarm/swarm.c:45
>>> [0]PETSC ERROR: #3 VecView_Swarm() at
>>> /Users/markadams/Codes/petsc/src/dm/impls/swarm/swarm.c:86
>>> [0]PETSC ERROR: #4 VecView() at
>>> /Users/markadams/Codes/petsc/src/vec/vec/interface/vector.c:806
>>> [0]PETSC ERROR: #5 PetscObjectView() at
>>> /Users/markadams/Codes/petsc/src/sys/objects/destroy.c:76
>>> [0]PETSC ERROR: #6 PetscObjectViewFromOptions() at
>>> /Users/markadams/Codes/petsc/src/sys/objects/destroy.c:128
>>> [0]PETSC ERROR: #7 VecViewFromOptions() at
>>> /Users/markadams/Codes/petsc/src/vec/vec/interface/vector.c:691
>>>
>>
>>
>> --
>> What most experimenters take for granted before they begin their
>> experiments is infinitely more interesting than any results to which their
>> experiments lead.
>> -- Norbert Wiener
>>
>> https://www.cse.buffalo.edu/~knepley/
>> 
>>
>

-- 
What most experimenters take for granted before they begin their
experiments is infinitely more interesting than any results to which their
experiments lead.
-- Norbert Wiener

https://www.cse.buffalo.edu/~knepley/ 


Re: [petsc-users] ScaLAPACK EPS error

2024-01-18 Thread Peder Jørgensgaard Olesen via petsc-users
I set up the matrix using MatCreateDense(), passing PETSC_DECIDE for the local 
dimensions.

The same error appears with 8, 12, and 16 nodes (32 proc/node).

I'll have to get back to you regarding a minimal example.

Best,
Peder

From: Jose E. Roman 
Sent: 18 January 2024 19:28
To: Peder Jørgensgaard Olesen 
Cc: petsc-users@mcs.anl.gov 
Subject: Re: [petsc-users] ScaLAPACK EPS error

How are you setting up your input matrix? Are you giving the local sizes or 
setting them to PETSC_DECIDE?
Do you get the same error for different number of MPI processes?
Can you send a small code reproducing the error?

Jose


> El 18 ene 2024, a las 18:59, Peder Jørgensgaard Olesen via petsc-users 
>  escribió:
>
> Hello,
>
> I need to determine the full set of eigenpairs to a rather large (N=16,000) 
> dense Hermitian matrix. I've managed to do this using SLEPc's standard 
> Krylov-Schur EPS, but I think it could be done more efficiently using 
> ScaLAPACK. I receive the following error when attempting this. As I 
> understand it, descinit is used to initialize an array, and the variable in 
> question designates the leading dimension of the array, for which it seems an 
> illegal value is somehow passed.
>
> I know ScaLAPACK is an external package, but it seems as if the error would 
> be in the call from SLEPc. Any ideas as to what could cause this?
>
> Thanks,
> Peder
>
> Error message (excerpt):
>
> PETSC ERROR: #1 MatConvert_Dense_ScaLAPACK() at [...]/matscalapack.c:1032
> PETSC ERROR: #2 MatConvert at [...]/matrix.c:4250
> PETSC ERROR: #3 EPSSetUp_ScaLAPACK() at [...]/scalapack.c:47
> PETSC ERROR: #4 EPSSetUp() at [...]/epssetup.c:323
> PETSC ERROR: #5 EPSSolve at [...]/epssolve.c:134
> PETSC ERROR: -- Error message --
> PETSC ERROR: Error in external library
> PETSC ERROR: Error in ScaLAPACK subroutine descinit: info=-9
> (...)
>
> Log file (excerpt):
> {  357,0}:  On entry to DESCINIT parameter number   9 had an illegal value
> [and a few hundred lines similar to this]




Re: [petsc-users] ScaLAPACK EPS error

2024-01-18 Thread Barry Smith

   Looks like you are using an older version of PETSc. Could you please switch 
to the latest, try again, and send the same information if that also fails.

  Barry


> On Jan 18, 2024, at 12:59 PM, Peder Jørgensgaard Olesen via petsc-users 
>  wrote:
> 
> Hello,
> 
> I need to determine the full set of eigenpairs to a rather large (N=16,000) 
> dense Hermitian matrix. I've managed to do this using SLEPc's standard 
> Krylov-Schur EPS, but I think it could be done more efficiently using 
> ScaLAPACK. I receive the following error when attempting this. As I 
> understand it, descinit is used to initialize an array, and the variable in 
> question designates the leading dimension of the array, for which it seems an 
> illegal value is somehow passed.
> 
> I know ScaLAPACK is an external package, but it seems as if the error would 
> be in the call from SLEPc. Any ideas as to what could cause this?
> 
> Thanks,
> Peder
> 
> Error message (excerpt):
> 
> PETSC ERROR: #1 MatConvert_Dense_ScaLAPACK() at [...]/matscalapack.c:1032
> PETSC ERROR: #2 MatConvert at [...]/matrix.c:4250
> PETSC ERROR: #3 EPSSetUp_ScaLAPACK() at [...]/scalapack.c:47
> PETSC ERROR: #4 EPSSetUp() at [...]/epssetup.c:323
> PETSC ERROR: #5 EPSSolve at [...]/epssolve.c:134
> PETSC ERROR: -- Error message --
> PETSC ERROR: Error in external library
> PETSC ERROR: Error in ScaLAPACK subroutine descinit: info=-9
> (...)
> 
> Log file (excerpt):
> {  357,0}:  On entry to DESCINIT parameter number   9 had an illegal value
> [and a few hundred lines similar to this]



Re: [petsc-users] ScaLAPACK EPS error

2024-01-18 Thread Jose E. Roman
How are you setting up your input matrix? Are you giving the local sizes or 
setting them to PETSC_DECIDE?
Do you get the same error for different number of MPI processes?
Can you send a small code reproducing the error?

Jose


> El 18 ene 2024, a las 18:59, Peder Jørgensgaard Olesen via petsc-users 
>  escribió:
> 
> Hello,
> 
> I need to determine the full set of eigenpairs to a rather large (N=16,000) 
> dense Hermitian matrix. I've managed to do this using SLEPc's standard 
> Krylov-Schur EPS, but I think it could be done more efficiently using 
> ScaLAPACK. I receive the following error when attempting this. As I 
> understand it, descinit is used to initialize an array, and the variable in 
> question designates the leading dimension of the array, for which it seems an 
> illegal value is somehow passed.
> 
> I know ScaLAPACK is an external package, but it seems as if the error would 
> be in the call from SLEPc. Any ideas as to what could cause this?
> 
> Thanks,
> Peder
> 
> Error message (excerpt):
> 
> PETSC ERROR: #1 MatConvert_Dense_ScaLAPACK() at [...]/matscalapack.c:1032
> PETSC ERROR: #2 MatConvert at [...]/matrix.c:4250
> PETSC ERROR: #3 EPSSetUp_ScaLAPACK() at [...]/scalapack.c:47
> PETSC ERROR: #4 EPSSetUp() at [...]/epssetup.c:323
> PETSC ERROR: #5 EPSSolve at [...]/epssolve.c:134
> PETSC ERROR: -- Error message --
> PETSC ERROR: Error in external library
> PETSC ERROR: Error in ScaLAPACK subroutine descinit: info=-9
> (...)
> 
> Log file (excerpt):
> {  357,0}:  On entry to DESCINIT parameter number   9 had an illegal value
> [and a few hundred lines similar to this]




Re: [petsc-users] Swarm view HDF5

2024-01-18 Thread Mark Adams
I had this working at one point.

Should I add PetscViewerHDF5PushTimestepping?
I don't create a viewer now, but I could make one.

Thanks,
Mark


On Thu, Jan 18, 2024 at 11:26 AM Matthew Knepley  wrote:

> On Thu, Jan 18, 2024 at 10:08 AM Mark Adams  wrote:
>
>> I am trying to view a DMSwarm with: -weights_view hdf5:part.h5
>>
>>  Vec f;
>>   PetscCall(DMSetOutputSequenceNumber(sw, 0, 0.0));
>>   PetscCall(DMSwarmCreateGlobalVectorFromField(sw, "w_q", &f));
>>   PetscCall(PetscObjectSetName((PetscObject)f, "particle weights"));
>>   PetscCall(VecViewFromOptions(f, NULL, "-weights_view"));
>>   PetscCall(DMSwarmDestroyGlobalVectorFromField(sw, "w_q", &f));
>>
>> And I get this error. I had this working once and did not set
>> PetscViewerHDF5PushTimestepping, so I wanted to check.
>>
>
> We probably were not checking then. We might have to check there when we
> set the timestep.
>
>   Thanks,
>
> Matt
>
>
>> Thanks,
>> Mark
>>
>>
>> [0]PETSC ERROR: Object is in wrong state
>> [0]PETSC ERROR: Timestepping has not been pushed yet. Call
>> PetscViewerHDF5PushTimestepping() first
>> [0]PETSC ERROR: WARNING! There are unused option(s) set! Could be the
>> program crashed before usage or a spelling mistake, etc!
>> [0]PETSC ERROR:   Option left: name:-options_left (no value) source:
>> command line
>> [0]PETSC ERROR: See https://petsc.org/release/faq/ for trouble shooting.
>> [0]PETSC ERROR: Petsc Development GIT revision: v3.20.3-461-g585a01bd688
>>  GIT Date: 2024-01-16 23:32:45 +
>> [0]PETSC ERROR: ./ex30k on a arch-macosx-gnu-O named MarksMac-302.local
>> by markadams Thu Jan 18 10:05:53 2024
>> [0]PETSC ERROR: Configure options CFLAGS="-g -Wno-deprecated-declarations
>> " CXXFLAGS="-g -Wno-deprecated-declarations " COPTFLAGS=-O CXXOPTFLAGS=-O
>> --with-cc=/usr/local/opt/llvm/bin/clang
>> --with-cxx=/usr/local/opt/llvm/bin/clang++ --download-mpich
>> --with-strict-petscerrorcode --download-triangle=1 --with-debugging=0
>> --download-hdf5=1 PETSC_ARCH=arch-macosx-gnu-O
>> [0]PETSC ERROR: #1 PetscViewerHDF5SetTimestep() at
>> /Users/markadams/Codes/petsc/src/sys/classes/viewer/impls/hdf5/hdf5v.c:990
>> [0]PETSC ERROR: #2 VecView_Swarm_HDF5_Internal() at
>> /Users/markadams/Codes/petsc/src/dm/impls/swarm/swarm.c:45
>> [0]PETSC ERROR: #3 VecView_Swarm() at
>> /Users/markadams/Codes/petsc/src/dm/impls/swarm/swarm.c:86
>> [0]PETSC ERROR: #4 VecView() at
>> /Users/markadams/Codes/petsc/src/vec/vec/interface/vector.c:806
>> [0]PETSC ERROR: #5 PetscObjectView() at
>> /Users/markadams/Codes/petsc/src/sys/objects/destroy.c:76
>> [0]PETSC ERROR: #6 PetscObjectViewFromOptions() at
>> /Users/markadams/Codes/petsc/src/sys/objects/destroy.c:128
>> [0]PETSC ERROR: #7 VecViewFromOptions() at
>> /Users/markadams/Codes/petsc/src/vec/vec/interface/vector.c:691
>>
>
>
> --
> What most experimenters take for granted before they begin their
> experiments is infinitely more interesting than any results to which their
> experiments lead.
> -- Norbert Wiener
>
> https://www.cse.buffalo.edu/~knepley/
> 
>


[petsc-users] ScaLAPACK EPS error

2024-01-18 Thread Peder Jørgensgaard Olesen via petsc-users
Hello,

I need to determine the full set of eigenpairs of a rather large (N=16,000) 
dense Hermitian matrix. I've managed to do this using SLEPc's standard 
Krylov-Schur EPS, but I think it could be done more efficiently using 
ScaLAPACK. I receive the following error when attempting this. As I understand 
it, descinit is used to initialize an array, and the variable in question 
designates the leading dimension of the array, for which it seems an illegal 
value is somehow passed.

I know ScaLAPACK is an external package, but it seems as if the error would be 
in the call from SLEPc. Any ideas as to what could cause this?

Thanks,
Peder

Error message (excerpt):

PETSC ERROR: #1 MatConvert_Dense_ScaLAPACK() at [...]/matscalapack.c:1032
PETSC ERROR: #2 MatConvert at [...]/matrix.c:4250
PETSC ERROR: #3 EPSSetUp_ScaLAPACK() at [...]/scalapack.c:47
PETSC ERROR: #4 EPSSetUp() at [...]/epssetup.c:323
PETSC ERROR: #5 EPSSolve at [...]/epssolve.c:134
PETSC ERROR: -- Error message --
PETSC ERROR: Error in external library
PETSC ERROR: Error in ScaLAPACK subroutine descinit: info=-9
(...)

Log file (excerpt):
{  357,0}:  On entry to DESCINIT parameter number   9 had an illegal value
[and a few hundred lines similar to this]
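
For context, selecting the dense ScaLAPACK eigensolver from slepc4py looks
roughly like the sketch below (an illustration only; the EPS type name
'scalapack' and a PETSc/SLEPc build configured with ScaLAPACK are assumed, and
this is not the code that produced the error above):

```
from petsc4py import PETSc
from slepc4py import SLEPc

n = 1000  # small stand-in for the N=16,000 case
A = PETSc.Mat().createDense([n, n], comm=PETSc.COMM_WORLD)  # local sizes left to PETSC_DECIDE
A.setUp()
rstart, rend = A.getOwnershipRange()
for i in range(rstart, rend):
    A.setValue(i, i, float(i + 1))  # simple Hermitian (diagonal) test matrix
A.assemble()

E = SLEPc.EPS().create(comm=PETSc.COMM_WORLD)
E.setOperators(A)
E.setProblemType(SLEPc.EPS.ProblemType.HEP)  # Hermitian eigenproblem
E.setType('scalapack')                       # dense solver backed by ScaLAPACK
E.solve()
PETSc.Sys.Print('converged eigenpairs:', E.getConverged())
```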


Re: [petsc-users] undefined reference to `petsc_allreduce_ct_th'

2024-01-18 Thread Aaron Scheinberg
Thanks, it turns out there was another installation of PETSc, and it was
linking with the wrong one. It builds now.

On Thu, Jan 18, 2024 at 12:03 PM Barry Smith  wrote:

>
>The PETSc petsclog.h  (included by petscsys.h) uses C macro magic to
> log calls to MPI routines. This is how the symbol is getting into your
> code. But normally
> if you use PetscInitialize() and link to the PETSc library the symbol
> would get resolved.
>
>If that part of the code does not need PETSc at all you can not include
> petscsys.h and instead include mpi.h otherwise you need to track down why
> when your code gets linked against PETSc libraries that symbol is not
> resolved.
>
>   Barry
>
>
> On Jan 18, 2024, at 11:55 AM, Aaron Scheinberg 
> wrote:
>
> Hello,
>
> I'm getting this error when linking:
>
> undefined reference to `petsc_allreduce_ct_th'
>
> The instances are regular MPI_Allreduces in my code that are not located
> in parts of the code related to PETSc, so I'm wondering what is happening
> to involve PETSc here? Can I configure it to avoid that? I consulted
> google, the FAQ and skimmed other documentation but didn't see anything.
> Thanks!
>
> Aaron
>
>
>


Re: [petsc-users] undefined reference to `petsc_allreduce_ct_th'

2024-01-18 Thread Barry Smith

   The PETSc petsclog.h (included by petscsys.h) uses C macro magic to log 
calls to MPI routines. This is how the symbol is getting into your code. But 
normally, if you use PetscInitialize() and link to the PETSc library, the 
symbol gets resolved.

   If that part of the code does not need PETSc at all, you can avoid including 
petscsys.h and include mpi.h instead; otherwise you need to track down why the 
symbol is not resolved when your code is linked against the PETSc libraries.

  Barry


> On Jan 18, 2024, at 11:55 AM, Aaron Scheinberg  wrote:
> 
> Hello,
> 
> I'm getting this error when linking:
> 
> undefined reference to `petsc_allreduce_ct_th'
> 
> The instances are regular MPI_Allreduces in my code that are not located in 
> parts of the code related to PETSc, so I'm wondering what is happening to 
> involve PETSc here? Can I configure it to avoid that? I consulted google, the 
> FAQ and skimmed other documentation but didn't see anything. Thanks!
> 
> Aaron



Re: [petsc-users] undefined reference to `petsc_allreduce_ct_th'

2024-01-18 Thread Satish Balay via petsc-users


On Thu, 18 Jan 2024, Aaron Scheinberg wrote:

> Hello,
> 
> I'm getting this error when linking:
> 
> undefined reference to `petsc_allreduce_ct_th'
> 
> The instances are regular MPI_Allreduces in my code that are not located in
> parts of the code related to PETSc, so I'm wondering what is happening to
> involve PETSc here? 

This symbol should be in libpetsc.so. Are you including petsc.h - but not 
linking in -lpetsc - from your code?

balay@pj01:~/petsc/arch-linux-c-debug/lib$ nm -Ao libpetsc.so |grep 
petsc_allreduce_ct_th
libpetsc.so:04279a50 B petsc_allreduce_ct_th

> Can I configure it to avoid that? I consulted google,
> the FAQ and skimmed other documentation but didn't see anything. Thanks!

If you wish to avoid petsc logging of MPI messages (but include petsc.h in your 
code?) - you can use in your code:


#define PETSC_HAVE_BROKEN_RECURSIVE_MACRO
#include <petsc.h>


Or build with the -DPETSC_HAVE_BROKEN_RECURSIVE_MACRO compiler option

Satish


[petsc-users] undefined reference to `petsc_allreduce_ct_th'

2024-01-18 Thread Aaron Scheinberg
Hello,

I'm getting this error when linking:

undefined reference to `petsc_allreduce_ct_th'

The instances are regular MPI_Allreduces in my code that are not located in
parts of the code related to PETSc, so I'm wondering what is happening to
involve PETSc here? Can I configure it to avoid that? I consulted google,
the FAQ and skimmed other documentation but didn't see anything. Thanks!

Aaron


Re: [petsc-users] Swarm view HDF5

2024-01-18 Thread Matthew Knepley
On Thu, Jan 18, 2024 at 10:08 AM Mark Adams  wrote:

> I am trying to view a DMSwarm with: -weights_view hdf5:part.h5
>
>  Vec f;
>   PetscCall(DMSetOutputSequenceNumber(sw, 0, 0.0));
>   PetscCall(DMSwarmCreateGlobalVectorFromField(sw, "w_q", &f));
>   PetscCall(PetscObjectSetName((PetscObject)f, "particle weights"));
>   PetscCall(VecViewFromOptions(f, NULL, "-weights_view"));
>   PetscCall(DMSwarmDestroyGlobalVectorFromField(sw, "w_q", &f));
>
> And I get this error. I had this working once and did not set
> PetscViewerHDF5PushTimestepping, so I wanted to check.
>

We probably were not checking then. We might have to check there when we
set the timestep.

  Thanks,

Matt
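
For reference, a minimal petsc4py sketch of the workaround the error message suggests: open the HDF5 viewer explicitly and push timestepping before viewing. This is only an illustration on a plain Vec (not Mark's swarm code); it assumes a PETSc build with HDF5 and that petsc4py exposes ViewerHDF5.pushTimestepping() and setTimestep(), which may vary by version.

```python
from petsc4py import PETSc

# Stand-in vector for the swarm-derived "particle weights" field.
vec = PETSc.Vec().createMPI(8, comm=PETSc.COMM_WORLD)
vec.setName("particle weights")
vec.set(1.0)

# Open part.h5 ourselves instead of relying on -weights_view hdf5:part.h5,
# so timestepping can be pushed before the view.
viewer = PETSc.ViewerHDF5().create("part.h5", mode="w", comm=PETSc.COMM_WORLD)
viewer.pushTimestepping()   # the call the error message asks for
viewer.setTimestep(0)
vec.view(viewer)
viewer.destroy()
```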


> Thanks,
> Mark
>
>
> [0]PETSC ERROR: Object is in wrong state
> [0]PETSC ERROR: Timestepping has not been pushed yet. Call
> PetscViewerHDF5PushTimestepping() first
> [0]PETSC ERROR: WARNING! There are unused option(s) set! Could be the
> program crashed before usage or a spelling mistake, etc!
> [0]PETSC ERROR:   Option left: name:-options_left (no value) source:
> command line
> [0]PETSC ERROR: See https://petsc.org/release/faq/ for trouble shooting.
> [0]PETSC ERROR: Petsc Development GIT revision: v3.20.3-461-g585a01bd688
>  GIT Date: 2024-01-16 23:32:45 +
> [0]PETSC ERROR: ./ex30k on a arch-macosx-gnu-O named MarksMac-302.local by
> markadams Thu Jan 18 10:05:53 2024
> [0]PETSC ERROR: Configure options CFLAGS="-g -Wno-deprecated-declarations
> " CXXFLAGS="-g -Wno-deprecated-declarations " COPTFLAGS=-O CXXOPTFLAGS=-O
> --with-cc=/usr/local/opt/llvm/bin/clang
> --with-cxx=/usr/local/opt/llvm/bin/clang++ --download-mpich
> --with-strict-petscerrorcode --download-triangle=1 --with-debugging=0
> --download-hdf5=1 PETSC_ARCH=arch-macosx-gnu-O
> [0]PETSC ERROR: #1 PetscViewerHDF5SetTimestep() at
> /Users/markadams/Codes/petsc/src/sys/classes/viewer/impls/hdf5/hdf5v.c:990
> [0]PETSC ERROR: #2 VecView_Swarm_HDF5_Internal() at
> /Users/markadams/Codes/petsc/src/dm/impls/swarm/swarm.c:45
> [0]PETSC ERROR: #3 VecView_Swarm() at
> /Users/markadams/Codes/petsc/src/dm/impls/swarm/swarm.c:86
> [0]PETSC ERROR: #4 VecView() at
> /Users/markadams/Codes/petsc/src/vec/vec/interface/vector.c:806
> [0]PETSC ERROR: #5 PetscObjectView() at
> /Users/markadams/Codes/petsc/src/sys/objects/destroy.c:76
> [0]PETSC ERROR: #6 PetscObjectViewFromOptions() at
> /Users/markadams/Codes/petsc/src/sys/objects/destroy.c:128
> [0]PETSC ERROR: #7 VecViewFromOptions() at
> /Users/markadams/Codes/petsc/src/vec/vec/interface/vector.c:691
>


-- 
What most experimenters take for granted before they begin their
experiments is infinitely more interesting than any results to which their
experiments lead.
-- Norbert Wiener

https://www.cse.buffalo.edu/~knepley/ 


Re: [petsc-users] Undestanding how to increase the overlap

2024-01-18 Thread Matthew Knepley
On Thu, Jan 18, 2024 at 10:05 AM TARDIEU Nicolas 
wrote:

> Arrghhh ! Shame on me !
> Sorry for that and thank you for the help Matt.
> It works like a charm now.
>

No problem. I like it when I figure it out :)

  Thanks,

 Matt


> --
> *De :* knep...@gmail.com 
> *Envoyé :* jeudi 18 janvier 2024 15:49
> *À :* TARDIEU Nicolas 
> *Cc :* petsc-users@mcs.anl.gov 
> *Objet :* Re: [petsc-users] Undestanding how to increase the overlap
>
> On Thu, Jan 18, 2024 at 9:46 AM TARDIEU Nicolas 
> wrote:
>
> Hi Matt,
> The isovl is in global numbering. I am pasting the output of the script  :
>
>
> I see. Is your matrix diagonal? If so, there is no overlap. You need
> connections to the other rows in order to have overlapping submatrices.
>
>   Thanks,
>
>  Matt
>
>
>
> #
> # The matrix
> # ---
> [1,0]:Mat Object: 2 MPI processes
> [1,0]:  type: mpiaij
> [1,0]:row 0: (0, 0.)
> [1,0]:row 1: (1, 1.)
> [1,0]:row 2: (2, 2.)
> [1,0]:row 3: (3, 3.)
> [1,0]:row 4: (4, 4.)
> [1,0]:row 5: (5, 5.)
>
> #
> # The IS isovl before the call to increaseOverlap
> # ---
> [1,0]:locSize before = 3
> [1,0]:IS Object: 1 MPI processes
> [1,0]:  type: stride
> [1,0]:Number of indices in (stride) set 3
> [1,0]:0 0
> [1,0]:1 1
> [1,0]:2 2
> [1,1]:locSize before = 3
> [1,1]:IS Object: 1 MPI processes
> [1,1]:  type: stride
> [1,1]:Number of indices in (stride) set 3
> [1,1]:0 3
> [1,1]:1 4
> [1,1]:2 5
>
> #
> # The IS isovl after the call to increaseOverlap
> # ---
> [1,0]:locSize after = 3
> [1,0]:IS Object: 1 MPI processes
> [1,0]:  type: general
> [1,0]:Number of indices in set 3
> [1,0]:0 0
> [1,0]:1 1
> [1,0]:2 2
> [1,1]:locSize after = 3
> [1,1]:IS Object: 1 MPI processes
> [1,1]:  type: general
> [1,1]:Number of indices in set 3
> [1,1]:0 3
> [1,1]:1 4
> [1,1]:2 5
>
> #
>
> Regards,
> Nicolas
> --
> Nicolas Tardieu
> Ing PhD Computational Mechanics
> EDF - R Dpt ERMES
> PARIS-SACLAY, FRANCE
> --
> *De :* knep...@gmail.com 
> *Envoyé :* jeudi 18 janvier 2024 15:29
> *À :* TARDIEU Nicolas 
> *Cc :* petsc-users@mcs.anl.gov 
> *Objet :* Re: [petsc-users] Undestanding how to increase the overlap
>
> On Thu, Jan 18, 2024 at 9:24 AM TARDIEU Nicolas via petsc-users <
> petsc-users@mcs.anl.gov> wrote:
>
> Dear PETSc Team,
>
> I am trying to understand how to increase the overlap of a matrix.
> I wrote the attached petsc4py script where I build a simple matrix and
> play with the increaseOverlap method. Unfortunately, before and after the
> call, nothing changes in the index set. FYI, I have tried to mimic
> src/ksp/ksp/tutorials/ex82.c, lines 72:76.
> Here is how I run the script : "mpiexec -n 2 python test_overlap.py"
>
> Could you please indicate what I am missing ?
>
>
> Usually matrix functions like this take input in global indices. It looks
> like your isovl is in local indices. Am I reading that correctly?
>
>   Thanks,
>
>  Matt
>
>
> Regards,
> Nicolas
> --
> Nicolas Tardieu
> Ing PhD Computational Mechanics
> EDF - R Dpt ERMES
> PARIS-SACLAY, FRANCE
>
>

[petsc-users] Swarm view HDF5

2024-01-18 Thread Mark Adams
I am trying to view a DMSwarm with: -weights_view hdf5:part.h5

 Vec f;
  PetscCall(DMSetOutputSequenceNumber(sw, 0, 0.0));
  PetscCall(DMSwarmCreateGlobalVectorFromField(sw, "w_q", &f));
  PetscCall(PetscObjectSetName((PetscObject)f, "particle weights"));
  PetscCall(VecViewFromOptions(f, NULL, "-weights_view"));
  PetscCall(DMSwarmDestroyGlobalVectorFromField(sw, "w_q", &f));

And I get this error. I had this working once and did not set
PetscViewerHDF5PushTimestepping, so I wanted to check.

Thanks,
Mark


[0]PETSC ERROR: Object is in wrong state
[0]PETSC ERROR: Timestepping has not been pushed yet. Call
PetscViewerHDF5PushTimestepping() first
[0]PETSC ERROR: WARNING! There are unused option(s) set! Could be the
program crashed before usage or a spelling mistake, etc!
[0]PETSC ERROR:   Option left: name:-options_left (no value) source:
command line
[0]PETSC ERROR: See https://petsc.org/release/faq/ for trouble shooting.
[0]PETSC ERROR: Petsc Development GIT revision: v3.20.3-461-g585a01bd688
 GIT Date: 2024-01-16 23:32:45 +
[0]PETSC ERROR: ./ex30k on a arch-macosx-gnu-O named MarksMac-302.local by
markadams Thu Jan 18 10:05:53 2024
[0]PETSC ERROR: Configure options CFLAGS="-g -Wno-deprecated-declarations "
CXXFLAGS="-g -Wno-deprecated-declarations " COPTFLAGS=-O CXXOPTFLAGS=-O
--with-cc=/usr/local/opt/llvm/bin/clang
--with-cxx=/usr/local/opt/llvm/bin/clang++ --download-mpich
--with-strict-petscerrorcode --download-triangle=1 --with-debugging=0
--download-hdf5=1 PETSC_ARCH=arch-macosx-gnu-O
[0]PETSC ERROR: #1 PetscViewerHDF5SetTimestep() at
/Users/markadams/Codes/petsc/src/sys/classes/viewer/impls/hdf5/hdf5v.c:990
[0]PETSC ERROR: #2 VecView_Swarm_HDF5_Internal() at
/Users/markadams/Codes/petsc/src/dm/impls/swarm/swarm.c:45
[0]PETSC ERROR: #3 VecView_Swarm() at
/Users/markadams/Codes/petsc/src/dm/impls/swarm/swarm.c:86
[0]PETSC ERROR: #4 VecView() at
/Users/markadams/Codes/petsc/src/vec/vec/interface/vector.c:806
[0]PETSC ERROR: #5 PetscObjectView() at
/Users/markadams/Codes/petsc/src/sys/objects/destroy.c:76
[0]PETSC ERROR: #6 PetscObjectViewFromOptions() at
/Users/markadams/Codes/petsc/src/sys/objects/destroy.c:128
[0]PETSC ERROR: #7 VecViewFromOptions() at
/Users/markadams/Codes/petsc/src/vec/vec/interface/vector.c:691


Re: [petsc-users] Undestanding how to increase the overlap

2024-01-18 Thread TARDIEU Nicolas via petsc-users
Arrghhh ! Shame on me !
Sorry for that and thank you for the help Matt.
It works like a charm now.

De : knep...@gmail.com 
Envoyé : jeudi 18 janvier 2024 15:49
À : TARDIEU Nicolas 
Cc : petsc-users@mcs.anl.gov 
Objet : Re: [petsc-users] Undestanding how to increase the overlap

On Thu, Jan 18, 2024 at 9:46 AM TARDIEU Nicolas 
mailto:nicolas.tard...@edf.fr>> wrote:
Hi Matt,
The isovl is in global numbering. I am pasting the output of the script  :

I see. Is your matrix diagonal? If so, there is no overlap. You need 
connections to the other rows in order to have overlapping submatrices.

  Thanks,

 Matt

#
# The matrix
# ---
[1,0]:Mat Object: 2 MPI processes
[1,0]:  type: mpiaij
[1,0]:row 0: (0, 0.)
[1,0]:row 1: (1, 1.)
[1,0]:row 2: (2, 2.)
[1,0]:row 3: (3, 3.)
[1,0]:row 4: (4, 4.)
[1,0]:row 5: (5, 5.)
#
# The IS isovl before the call to increaseOverlap
# ---
[1,0]:locSize before = 3
[1,0]:IS Object: 1 MPI processes
[1,0]:  type: stride
[1,0]:Number of indices in (stride) set 3
[1,0]:0 0
[1,0]:1 1
[1,0]:2 2
[1,1]:locSize before = 3
[1,1]:IS Object: 1 MPI processes
[1,1]:  type: stride
[1,1]:Number of indices in (stride) set 3
[1,1]:0 3
[1,1]:1 4
[1,1]:2 5
#
# The IS isovl after the call to increaseOverlap
# ---
[1,0]:locSize after = 3
[1,0]:IS Object: 1 MPI processes
[1,0]:  type: general
[1,0]:Number of indices in set 3
[1,0]:0 0
[1,0]:1 1
[1,0]:2 2
[1,1]:locSize after = 3
[1,1]:IS Object: 1 MPI processes
[1,1]:  type: general
[1,1]:Number of indices in set 3
[1,1]:0 3
[1,1]:1 4
[1,1]:2 5
#

Regards,
Nicolas
--
Nicolas Tardieu
Ing PhD Computational Mechanics
EDF - R Dpt ERMES
PARIS-SACLAY, FRANCE

De : knep...@gmail.com 
mailto:knep...@gmail.com>>
Envoyé : jeudi 18 janvier 2024 15:29
À : TARDIEU Nicolas mailto:nicolas.tard...@edf.fr>>
Cc : petsc-users@mcs.anl.gov 
mailto:petsc-users@mcs.anl.gov>>
Objet : Re: [petsc-users] Undestanding how to increase the overlap

On Thu, Jan 18, 2024 at 9:24 AM TARDIEU Nicolas via petsc-users 
mailto:petsc-users@mcs.anl.gov>> wrote:
Dear PETSc Team,

I am trying to understand how to increase the overlap of a matrix.
I wrote the attached petsc4py script where I build a simple matrix and play 
with the increaseOverlap method. Unfortunately, before and after the call, 
nothing changes in the index set. FYI, I have tried to mimic 
src/ksp/ksp/tutorials/ex82.c, lines 72:76.
Here is how I run the script : "mpiexec -n 2 python test_overlap.py"

Could you please indicate what I am missing ?

Usually matrix functions like this take input in global indices. It looks like 
your isovl is in local indices. Am I reading that correctly?

  Thanks,

 Matt

Regards,
Nicolas
--
Nicolas Tardieu
Ing PhD Computational Mechanics
EDF - R Dpt ERMES
PARIS-SACLAY, FRANCE




Re: [petsc-users] Undestanding how to increase the overlap

2024-01-18 Thread Matthew Knepley
On Thu, Jan 18, 2024 at 9:46 AM TARDIEU Nicolas 
wrote:

> Hi Matt,
> The isovl is in global numbering. I am pasting the output of the script  :
>

I see. Is your matrix diagonal? If so, there is no overlap. You need
connections to the other rows in order to have overlapping submatrices.

  Thanks,

 Matt
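
As a concrete check of this, here is a minimal petsc4py sketch (an illustration adapted from the attached test_overlap.py, not code from the thread) in which the toy matrix is made tridiagonal so each row is coupled to its neighbours; with that coupling, an overlap of 1 should enlarge each rank's index set across the block boundary when run on 2 ranks:

```python
from petsc4py.PETSc import Mat, IS, COMM_SELF

NEQ = 6
M = Mat().create()
M.setSizes([NEQ, NEQ])
M.setType('aij')
M.setUp()
M.setPreallocationNNZ(3)
cs, ce = M.getOwnershipRange()
for row in range(cs, ce):
    for col in (row - 1, row, row + 1):   # couple each row to its neighbours
        if 0 <= col < NEQ:
            M.setValue(row, col, 1.0)
M.assemble()

isovl = IS().createStride(ce - cs, cs, 1, comm=COMM_SELF)
M.increaseOverlap(isovl, 1)
isovl.view()   # on 2 ranks, each IS should now hold 4 indices instead of 3
```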


>
> #
> # The matrix
> # ---
> [1,0]:Mat Object: 2 MPI processes
> [1,0]:  type: mpiaij
> [1,0]:row 0: (0, 0.)
> [1,0]:row 1: (1, 1.)
> [1,0]:row 2: (2, 2.)
> [1,0]:row 3: (3, 3.)
> [1,0]:row 4: (4, 4.)
> [1,0]:row 5: (5, 5.)
>
> #
> # The IS isovl before the call to increaseOverlap
> # ---
> [1,0]:locSize before = 3
> [1,0]:IS Object: 1 MPI processes
> [1,0]:  type: stride
> [1,0]:Number of indices in (stride) set 3
> [1,0]:0 0
> [1,0]:1 1
> [1,0]:2 2
> [1,1]:locSize before = 3
> [1,1]:IS Object: 1 MPI processes
> [1,1]:  type: stride
> [1,1]:Number of indices in (stride) set 3
> [1,1]:0 3
> [1,1]:1 4
> [1,1]:2 5
>
> #
> # The IS isovl after the call to increaseOverlap
> # ---
> [1,0]:locSize after = 3
> [1,0]:IS Object: 1 MPI processes
> [1,0]:  type: general
> [1,0]:Number of indices in set 3
> [1,0]:0 0
> [1,0]:1 1
> [1,0]:2 2
> [1,1]:locSize after = 3
> [1,1]:IS Object: 1 MPI processes
> [1,1]:  type: general
> [1,1]:Number of indices in set 3
> [1,1]:0 3
> [1,1]:1 4
> [1,1]:2 5
>
> #
>
> Regards,
> Nicolas
> --
> Nicolas Tardieu
> Ing PhD Computational Mechanics
> EDF - R Dpt ERMES
> PARIS-SACLAY, FRANCE
> --
> *De :* knep...@gmail.com 
> *Envoyé :* jeudi 18 janvier 2024 15:29
> *À :* TARDIEU Nicolas 
> *Cc :* petsc-users@mcs.anl.gov 
> *Objet :* Re: [petsc-users] Undestanding how to increase the overlap
>
> On Thu, Jan 18, 2024 at 9:24 AM TARDIEU Nicolas via petsc-users <
> petsc-users@mcs.anl.gov> wrote:
>
> Dear PETSc Team,
>
> I am trying to understand how to increase the overlap of a matrix.
> I wrote the attached petsc4py script where I build a simple matrix and
> play with the increaseOverlap method. Unfortunately, before and after the
> call, nothing changes in the index set. FYI, I have tried to mimic
> src/ksp/ksp/tutorials/ex82.c, lines 72:76.
> Here is how I run the script : "mpiexec -n 2 python test_overlap.py"
>
> Could you please indicate what I am missing ?
>
>
> Usually matrix functions like this take input in global indices. It looks
> like your isovl is in local indices. Am I reading that correctly?
>
>   Thanks,
>
>  Matt
>
>
> Regards,
> Nicolas
> --
> Nicolas Tardieu
> Ing PhD Computational Mechanics
> EDF - R Dpt ERMES
> PARIS-SACLAY, FRANCE
>
>
>
>
>
> --
> What most experimenters take for granted before they begin their
> experiments is infinitely more interesting than any results to which their
> experiments lead.
> -- Norbert Wiener
>
> https://www.cse.buffalo.edu/~knepley/
> 
>
>

Re: [petsc-users] Undestanding how to increase the overlap

2024-01-18 Thread TARDIEU Nicolas via petsc-users
Hi Matt,
The isovl is in global numbering. I am pasting the output of the script  :

#
# The matrix
# ---
[1,0]:Mat Object: 2 MPI processes
[1,0]:  type: mpiaij
[1,0]:row 0: (0, 0.)
[1,0]:row 1: (1, 1.)
[1,0]:row 2: (2, 2.)
[1,0]:row 3: (3, 3.)
[1,0]:row 4: (4, 4.)
[1,0]:row 5: (5, 5.)
#
# The IS isovl before the call to increaseOverlap
# ---
[1,0]:locSize before = 3
[1,0]:IS Object: 1 MPI processes
[1,0]:  type: stride
[1,0]:Number of indices in (stride) set 3
[1,0]:0 0
[1,0]:1 1
[1,0]:2 2
[1,1]:locSize before = 3
[1,1]:IS Object: 1 MPI processes
[1,1]:  type: stride
[1,1]:Number of indices in (stride) set 3
[1,1]:0 3
[1,1]:1 4
[1,1]:2 5
#
# The IS isovl after the call to increaseOverlap
# ---
[1,0]:locSize after = 3
[1,0]:IS Object: 1 MPI processes
[1,0]:  type: general
[1,0]:Number of indices in set 3
[1,0]:0 0
[1,0]:1 1
[1,0]:2 2
[1,1]:locSize after = 3
[1,1]:IS Object: 1 MPI processes
[1,1]:  type: general
[1,1]:Number of indices in set 3
[1,1]:0 3
[1,1]:1 4
[1,1]:2 5
#

Regards,
Nicolas
--
Nicolas Tardieu
Ing PhD Computational Mechanics
EDF - R Dpt ERMES
PARIS-SACLAY, FRANCE

De : knep...@gmail.com 
Envoyé : jeudi 18 janvier 2024 15:29
À : TARDIEU Nicolas 
Cc : petsc-users@mcs.anl.gov 
Objet : Re: [petsc-users] Undestanding how to increase the overlap

On Thu, Jan 18, 2024 at 9:24 AM TARDIEU Nicolas via petsc-users 
mailto:petsc-users@mcs.anl.gov>> wrote:
Dear PETSc Team,

I am trying to understand how to increase the overlap of a matrix.
I wrote the attached petsc4py script where I build a simple matrix and play 
with the increaseOverlap method. Unfortunately, before and after the call, 
nothing changes in the index set. FYI, I have tried to mimic 
src/ksp/ksp/tutorials/ex82.c, lines 72:76.
Here is how I run the script : "mpiexec -n 2 python test_overlap.py"

Could you please indicate what I am missing ?

Usually matrix functions like this take input in global indices. It looks like 
your isovl is in local indices. Am I reading that correctly?

  Thanks,

 Matt

Regards,
Nicolas
--
Nicolas Tardieu
Ing PhD Computational Mechanics
EDF - R Dpt ERMES
PARIS-SACLAY, FRANCE



--
What most experimenters take for granted before they begin their experiments is 
infinitely more interesting than any results to which their experiments lead.
-- Norbert Wiener

https://www.cse.buffalo.edu/~knepley/




Re: [petsc-users] HashMap Error when populating AIJCUSPARSE matrix

2024-01-18 Thread Matthew Knepley
On Thu, Jan 18, 2024 at 9:04 AM Yesypenko, Anna 
wrote:

> Dear Petsc users/developers,
>
> I'm experiencing a bug when using petsc4py with GPU support. It may be my
> mistake in how I set up an AIJCUSPARSE matrix.
> For larger matrices, I sometimes encounter an error in assigning matrix
> values; the error is thrown in PetscHMapIJVQuerySet().
> Here is a minimum snippet that populates a sparse tridiagonal matrix.
>
> ```
> from petsc4py import PETSc
> from scipy.sparse import diags
> import numpy as np
>
> n = int(5e5);
>
> nnz = 3 * np.ones(n, dtype=np.int32); nnz[0] = nnz[-1] = 2
> A = PETSc.Mat(comm=PETSc.COMM_WORLD)
> A.createAIJ(size=[n,n],comm=PETSc.COMM_WORLD,nnz=nnz)
> A.setType('aijcusparse')
> tmp = diags([-1,2,-1],[-1,0,+1],shape=(n,n)).tocsr()
> A.setValuesCSR(tmp.indptr,tmp.indices,tmp.data)
> ### this is the line where the error is thrown.
> A.assemble()
> ```
>

I don't have scipy installed. Since the matrix is so small, can you
print tmp.indptr,tmp.indices,tmp.data when you run? It seems to be either
bad values there, or something is wrong with passing those pointers.

  Thanks,

 Matt
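
For what it is worth, the same CSR arrays can be rebuilt and printed with plain numpy, no scipy needed; a small sketch (illustrative only, with n shrunk so the printout stays readable):

```python
import numpy as np

n = 6   # small stand-in for the n = int(5e5) case, just to inspect the arrays
indptr = np.zeros(n + 1, dtype=np.int32)
indices, data = [], []
for i in range(n):
    cols = [j for j in (i - 1, i, i + 1) if 0 <= j < n]   # tridiagonal stencil
    indices.extend(cols)
    data.extend(2.0 if j == i else -1.0 for j in cols)
    indptr[i + 1] = len(indices)
indices = np.asarray(indices, dtype=np.int32)
data = np.asarray(data)
print(indptr)
print(indices)
print(data)
```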


> The error trace is below:
> ```
> File "petsc4py/PETSc/Mat.pyx", line 2603, in
> petsc4py.PETSc.Mat.setValuesCSR
>   File "petsc4py/PETSc/petscmat.pxi", line 1039, in
> petsc4py.PETSc.matsetvalues_csr
>   File "petsc4py/PETSc/petscmat.pxi", line 1032, in
> petsc4py.PETSc.matsetvalues_ijv
> petsc4py.PETSc.Error: error code 76
> [0] MatSetValues() at
> /work/06368/annayesy/ls6/petsc/src/mat/interface/matrix.c:1497
> [0] MatSetValues_Seq_Hash() at
> /work/06368/annayesy/ls6/petsc/include/../src/mat/impls/aij/seq/seqhashmatsetvalues.h:52
> [0] PetscHMapIJVQuerySet() at
> /work/06368/annayesy/ls6/petsc/include/petsc/private/hashmapijv.h:10
> [0] Error in external library
> [0] [khash] Assertion: `ret >= 0' failed.
> ```
>
> If I run the same script a handful of times, it will run without errors
> eventually.
> Does anyone have insight on why it is behaving this way? I'm running on a
> node with 3x NVIDIA A100 PCIE 40GB.
>
> Thank you!
> Anna
>


-- 
What most experimenters take for granted before they begin their
experiments is infinitely more interesting than any results to which their
experiments lead.
-- Norbert Wiener

https://www.cse.buffalo.edu/~knepley/ 


Re: [petsc-users] Undestanding how to increase the overlap

2024-01-18 Thread Matthew Knepley
On Thu, Jan 18, 2024 at 9:24 AM TARDIEU Nicolas via petsc-users <
petsc-users@mcs.anl.gov> wrote:

> Dear PETSc Team,
>
> I am trying to understand how to increase the overlap of a matrix.
> I wrote the attached petsc4py script where I build a simple matrix and
> play with the increaseOverlap method. Unfortunately, before and after the
> call, nothing changes in the index set. FYI, I have tried to mimic
> src/ksp/ksp/tutorials/ex82.c, lines 72:76.
> Here is how I run the script : "mpiexec -n 2 python test_overlap.py"
>
> Could you please indicate what I am missing ?
>

Usually matrix functions like this take input in global indices. It looks
like your isovl is in local indices. Am I reading that correctly?

  Thanks,

 Matt


> Regards,
> Nicolas
> --
> Nicolas Tardieu
> Ing PhD Computational Mechanics
> EDF - R Dpt ERMES
> PARIS-SACLAY, FRANCE
>
>
>


-- 
What most experimenters take for granted before they begin their
experiments is infinitely more interesting than any results to which their
experiments lead.
-- Norbert Wiener

https://www.cse.buffalo.edu/~knepley/ 


[petsc-users] Undestanding how to increase the overlap

2024-01-18 Thread TARDIEU Nicolas via petsc-users
Dear PETSc Team,

I am trying to understand how to increase the overlap of a matrix.
I wrote the attached petsc4py script where I build a simple matrix and play 
with the increaseOverlap method. Unfortunately, before and after the call, 
nothing changes in the index set. FYI, I have tried to mimic 
src/ksp/ksp/tutorials/ex82.c, lines 72:76.
Here is how I run the script : "mpiexec -n 2 python test_overlap.py"

Could you please indicate what I am missing ?

Regards,
Nicolas
--
Nicolas Tardieu
Ing PhD Computational Mechanics
EDF - R Dpt ERMES
PARIS-SACLAY, FRANCE




import petsc4py
# petsc4py.init(['-info'])
from petsc4py.PETSc import KSP, IS, Mat, Options, Viewer, PC, Vec, NullSpace, MatPartitioning, COMM_SELF


NEQ = 6

# Simple parallel diagonal matrix
M = Mat().create()
M.setSizes([NEQ, NEQ])
M.setType('aij')
M.setUp()
M.setPreallocationNNZ(1)
cs, ce = M.getOwnershipRange()
for row in range(cs, ce):
    M.setValue(row, row, row)
M.assemble()
M.view()


ovl=2
# reproduce initial layout
isovl = IS().createStride(ce-cs, cs, 1, comm=COMM_SELF)
loc = isovl.getLocalSize()
print(f"locSize before = {loc}", flush=True)
isovl.view()

# increase overlap
M.increaseOverlap(isovl, ovl)
loc = isovl.getLocalSize()
print(f"locSize after = {loc}", flush=True)
isovl.view()



[petsc-users] HashMap Error when populating AIJCUSPARSE matrix

2024-01-18 Thread Yesypenko, Anna
Dear Petsc users/developers,

I'm experiencing a bug when using petsc4py with GPU support. It may be my 
mistake in how I set up an AIJCUSPARSE matrix.
For larger matrices, I sometimes encounter an error in assigning matrix values; 
the error is thrown in PetscHMapIJVQuerySet().
Here is a minimum snippet that populates a sparse tridiagonal matrix.

```
from petsc4py import PETSc
from scipy.sparse import diags
import numpy as np

n = int(5e5);

nnz = 3 * np.ones(n, dtype=np.int32); nnz[0] = nnz[-1] = 2
A = PETSc.Mat(comm=PETSc.COMM_WORLD)
A.createAIJ(size=[n,n],comm=PETSc.COMM_WORLD,nnz=nnz)
A.setType('aijcusparse')
tmp = diags([-1,2,-1],[-1,0,+1],shape=(n,n)).tocsr()
A.setValuesCSR(tmp.indptr,tmp.indices,tmp.data)
### this is the line where the error is thrown.
A.assemble()
```

The error trace is below:
```
File "petsc4py/PETSc/Mat.pyx", line 2603, in petsc4py.PETSc.Mat.setValuesCSR
  File "petsc4py/PETSc/petscmat.pxi", line 1039, in 
petsc4py.PETSc.matsetvalues_csr
  File "petsc4py/PETSc/petscmat.pxi", line 1032, in 
petsc4py.PETSc.matsetvalues_ijv
petsc4py.PETSc.Error: error code 76
[0] MatSetValues() at 
/work/06368/annayesy/ls6/petsc/src/mat/interface/matrix.c:1497
[0] MatSetValues_Seq_Hash() at 
/work/06368/annayesy/ls6/petsc/include/../src/mat/impls/aij/seq/seqhashmatsetvalues.h:52
[0] PetscHMapIJVQuerySet() at 
/work/06368/annayesy/ls6/petsc/include/petsc/private/hashmapijv.h:10
[0] Error in external library
[0] [khash] Assertion: `ret >= 0' failed.
```

If I run the same script a handful of times, it will run without errors 
eventually.
Does anyone have insight on why it is behaving this way? I'm running on a node 
with 3x NVIDIA A100 PCIE 40GB.

Thank you!
Anna


Re: [petsc-users] MatAssemblyBegin freezes during MPI communication

2024-01-18 Thread Junchao Zhang
On Thu, Jan 18, 2024 at 1:47 AM 袁煕  wrote:

> Dear PETSc Experts,
>
> My FEM program works well generally, but in some specific cases where
> multiple CPUs are used, it freezes when calling MatAssemblyBegin where
> PMPI_Allreduce is called (see attached file).
>
> After some investigation, I found that it is most probably due to
>
> ・ MatSetValue is not called from all CPUs before MatAssemblyBegin
>
> For example, when 4 CPUs are used, if there are elements in CPU 0,1,2 but
> no elements in CPU 3, then all CPUs other than CPU 3 would call
> MatSetValue  function. I want to know
>
> 1. If my conjecture could be right? And If so
>
No.  All processes do MPI_Allreduce to know if there are incoming values
set by others.  To find out why it is hanging, you can attach gdb to all MPI
processes to see where they are.

>
>
> 2. Are there any convenient means to avoid this problem?
>
> Thanks,
> Xi YUAN, PhD Solid Mechanics
>


Re: [petsc-users] MatAssemblyBegin freezes during MPI communication

2024-01-18 Thread Matthew Knepley
On Thu, Jan 18, 2024 at 2:47 AM 袁煕  wrote:

> Dear PETSc Experts,
>
> My FEM program works well generally, but in some specific cases where
> multiple CPUs are used, it freezes when calling MatAssemblyBegin where
> PMPI_Allreduce is called (see attached file).
>
> After some investigation, I found that it is most probably due to
>
> ・ MatSetValue is not called from all CPUs before MatAssemblyBegin
>
> For example, when 4 CPUs are used, if there are elements in CPU 0,1,2 but
> no elements in CPU 3, then all CPUs other than CPU 3 would call
> MatSetValue  function. I want to know
>
> 1. If my conjecture could be right? And If so
>

No, you do not have to call MatSetValue() from all processes.


> 2. Are there any convenient means to avoid this problem?
>

Are you calling MatAssemblyBegin() from all processes? This is necessary.

  Thanks,

 Matt
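
To make the collectivity point concrete, a minimal petsc4py sketch (an illustration, not the original program) in which the last rank owns rows but sets no values, mimicking a subdomain with no elements; every rank still calls the assembly routines:

```python
from petsc4py import PETSc

comm = PETSc.COMM_WORLD
n = 8
A = PETSc.Mat().createAIJ([n, n], nnz=1, comm=comm)
rstart, rend = A.getOwnershipRange()

# The last rank sets nothing at all, like a process with no elements.
if comm.getRank() != comm.getSize() - 1:
    for row in range(rstart, rend):
        A.setValue(row, row, 1.0)

# Assembly is collective: every rank must reach these calls, values or not.
A.assemblyBegin()
A.assemblyEnd()
```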


> Thanks,
> Xi YUAN, PhD Solid Mechanics
>


-- 
What most experimenters take for granted before they begin their
experiments is infinitely more interesting than any results to which their
experiments lead.
-- Norbert Wiener

https://www.cse.buffalo.edu/~knepley/