It’s indeed very suspicious (to me) that we are using rmap to change a column 
index.
Switching to cmap gets your code running, but I’ll need to see if this triggers 
regressions.

Thanks for the report,
Pierre

diff --git a/src/mat/impls/aij/mpi/mpiov.c b/src/mat/impls/aij/mpi/mpiov.c
index d1037d7d817..051981ebe9a 100644
--- a/src/mat/impls/aij/mpi/mpiov.c
+++ b/src/mat/impls/aij/mpi/mpiov.c
@@ -2948,3 +2948,3 @@ PetscErrorCode MatSetSeqMats_MPIAIJ(Mat C, IS rowemb, IS dcolemb, IS ocolemb, Ma
 
-    PetscCall(PetscLayoutGetRange(C->rmap, &rstart, &rend));
+    PetscCall(PetscLayoutGetRange(C->cmap, &rstart, &rend));
     shift      = rend - rstart;
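
To make the distinction concrete, here is a minimal user-level sketch (not the PETSc source; sizes are made up) contrasting the row and column ownership ranges of a rectangular Mat:

  Mat      C;
  PetscInt rstart, rend, cstart, cend;

  PetscCall(MatCreate(PETSC_COMM_WORLD, &C));
  PetscCall(MatSetSizes(C, PETSC_DECIDE, PETSC_DECIDE, 8, 4)); /* 8 rows, 4 columns */
  PetscCall(MatSetType(C, MATMPIAIJ));
  PetscCall(MatSetUp(C));
  PetscCall(MatGetOwnershipRange(C, &rstart, &rend));       /* range of C->rmap */
  PetscCall(MatGetOwnershipRangeColumn(C, &cstart, &cend)); /* range of C->cmap */
  /* For a square Mat with matching layouts the two ranges coincide, which is
     why mixing up rmap and cmap only shows up with rectangular (sub)matrices. */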

$ cat proc_0_output.txt
rstart 0 rend 4
Mat Object: 3 MPI processes
  type: mpiaij
  row 0:   (0, 101.)    (3, 104.)    (6, 107.)    (9, 110.)   
  row 1:   (2, 203.)    (5, 206.)    (8, 209.)    (11, 212.)   
  row 2:   (1, 302.)    (4, 305.)    (7, 308.)    (10, 311.)   
  row 3:   (0, 401.)    (3, 404.)    (6, 407.)    (9, 410.)   
  row 4:   (2, 503.)    (5, 506.)    (8, 509.)    (11, 512.)   
  row 5:   (1, 602.)    (4, 605.)    (7, 608.)    (10, 611.)   
  row 6:   (0, 701.)    (3, 704.)    (6, 707.)    (9, 710.)   
  row 7:   (2, 803.)    (5, 806.)    (8, 809.)    (11, 812.)   
  row 8:   (1, 902.)    (4, 905.)    (7, 908.)    (10, 911.)   
  row 9:   (0, 1001.)    (3, 1004.)    (6, 1007.)    (9, 1010.)   
  row 10:   (2, 1103.)    (5, 1106.)    (8, 1109.)    (11, 1112.)   
  row 11:   (1, 1202.)    (4, 1205.)    (7, 1208.)    (10, 1211.)   
idxr proc
IS Object: 2 MPI processes
  type: general
[0] Number of indices in set 4
[0] 0 0
[0] 1 1
[0] 2 2
[0] 3 3
[1] Number of indices in set 4
[1] 0 4
[1] 1 5
[1] 2 6
[1] 3 7
idxc proc
IS Object: 2 MPI processes
  type: general
[0] Number of indices in set 2
[0] 0 0
[0] 1 1
[1] Number of indices in set 2
[1] 0 6
[1] 1 7
Mat Object: 2 MPI processes
  type: mpiaij
  row 0:   (0, 101.)    (2, 107.)   
  row 1:  
  row 2:   (1, 302.)    (3, 308.)   
  row 3:   (0, 401.)    (2, 407.)   
  row 4:  
  row 5:   (1, 602.)    (3, 608.)   
  row 6:   (0, 701.)    (2, 707.)   
  row 7:  
rstart 0 rend 4
local row 0: ( 0 , 1.010000e+02) ( 2 , 1.070000e+02)
local row 1:
local row 2: ( 1 , 3.020000e+02) ( 3 , 3.080000e+02)
local row 3: ( 0 , 4.010000e+02) ( 2 , 4.070000e+02)
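
For readers following the thread, the call pattern at issue boils down to something like this sketch (illustrative indices for colour 0, not the exact attached test; A stands for the 12x12 MPIAIJ matrix shown above):

  MPI_Comm    subcomm;
  PetscMPIInt rank;
  IS          isrow, iscol;
  Mat        *submat;
  const PetscInt rows[] = {0, 1, 2, 3}; /* per-rank values differ in the real test */
  const PetscInt cols[] = {0, 1};

  PetscCallMPI(MPI_Comm_rank(PETSC_COMM_WORLD, &rank));
  /* colour 0 on ranks 0 and 1, colour 1 on rank 2 */
  PetscCallMPI(MPI_Comm_split(PETSC_COMM_WORLD, rank < 2 ? 0 : 1, rank, &subcomm));
  PetscCall(ISCreateGeneral(subcomm, 4, rows, PETSC_COPY_VALUES, &isrow));
  PetscCall(ISCreateGeneral(subcomm, 2, cols, PETSC_COPY_VALUES, &iscol));
  /* the extracted sub-Mat lives on subcomm, not on the matrix communicator */
  PetscCall(MatCreateSubMatricesMPI(A, 1, &isrow, &iscol, MAT_INITIAL_MATRIX, &submat));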

> On 26 Aug 2025, at 3:18 PM, Pierre Jolivet <pie...@joliv.et> wrote:
> 
> 
>> On 26 Aug 2025, at 12:50 PM, Alexis SALZMAN <alexis.salz...@ec-nantes.fr> 
>> wrote:
>> 
>> Mark, you were right and I was wrong about the dense matrix. Adding explicit 
>> zeros to the distributed matrix used to extract the sub-matrices (making it 
>> dense) in my test does not change the behaviour: there is still an error.
>> 
>> I am finding it increasingly difficult to understand the logic of the row 
>> and column 'IS' creation. I ran many tests to achieve the desired result: a 
>> rectangular sub-matrix (so a rectangular or square sub-matrix appears to be 
>> possible). However, many others resulted in the same kind of error.
>> 
> This may be a PETSc bug in MatSetSeqMats_MPIAIJ().
> -> 2965               PetscCall(MatSetValues(aij->B, 1, &row, 1, &col, &v, INSERT_VALUES));
> col has a value of 4, which doesn’t make sense since the output Mat has only 4 
> columns (thus, as the error message suggests, the value should be at most 3).
> 
> Thanks,
> Pierre
>> From what I observed, the test only works if the column selection size 
>> (size_c in the test) bears a specific relation to the row selection size 
>> (size_r in the test) on proc 0 (which is rank 0 in both the communicator 
>> and the sub-communicator):
>> 
>> if size_r==2, it works whenever size_c<=2;
>> if 3<=size_r<=5, then size_c==size_r is the only working case.
>> This holds "regardless" of what is requested on proc 1 and in selr/selc 
>> (they cannot be dummy settings, though). In any case, this is certainly not 
>> an exhaustive analysis.
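>> 
>> For concreteness, the selections above are built along these lines (a 
>> sketch reusing the test's size_r/size_c/subcomm names; not the exact 
>> attached code):
>> 
>>   IS        isrow, iscol;
>>   PetscInt  i, *idxr, *idxc;
>>   PetscCall(PetscMalloc1(size_r, &idxr));
>>   PetscCall(PetscMalloc1(size_c, &idxc));
>>   for (i = 0; i < size_r; i++) idxr[i] = i; /* e.g. rows 0..size_r-1 on proc 0 */
>>   for (i = 0; i < size_c; i++) idxc[i] = i; /* e.g. cols 0..size_c-1 on proc 0 */
>>   PetscCall(ISCreateGeneral(subcomm, size_r, idxr, PETSC_OWN_POINTER, &isrow));
>>   PetscCall(ISCreateGeneral(subcomm, size_c, idxc, PETSC_OWN_POINTER, &iscol));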
>> 
>> Many thanks to anyone who can explain to me the logic behind the 
>> construction of row and column 'IS'.
>> 
>> Regards
>> 
>> A.S.
>> 
>> 
>> 
>> Le 25/08/2025 à 20:00, Alexis SALZMAN a écrit :
>>> Thanks Mark for your attention.
>>> 
>>> The raw, uncleaned error message (unlike the cleaned one in my July post) is as follows:
>>> 
>>> [0]PETSC ERROR: --------------------- Error Message 
>>> -------------------------------------------------------------- 
>>> [0]PETSC ERROR: Argument out of range 
>>> [0]PETSC ERROR: Column too large: col 4 max 3 
>>> [0]PETSC ERROR: See https://petsc.org/release/faq/ for trouble shooting. 
>>> [0]PETSC ERROR: Petsc Release Version 3.22.2, unknown  
>>> [0]PETSC ERROR: subnb with 3 MPI process(es) and PETSC_ARCH  on pc-str97.ec-nantes.fr by salzman Mon Aug 25 19:11:37 2025 
>>> [0]PETSC ERROR: Configure options: PETSC_ARCH=real_fc41_Release_gcc_i4 PETSC_DIR=/home/salzman/devel/ExternalLib/build/PETSC/petsc --doCleanup=1 --with-scalar-type=real --known-level1-dcache-linesize=64 --with-cc=gcc --CFLAGS="-fPIC " --CC_LINKER_FLAGS=-fopenmp --with-cxx=g++ --with-cxx-dialect=c++20 --CXXFLAGS="-fPIC " --CXX_LINKER_FLAGS=-fopenmp --with-fc=gfortran --FFLAGS="-fPIC " --FC_LINKER_FLAGS=-fopenmp --with-debugging=0 --with-fortran-bindings=0 --with-fortran-kernels=1 --with-mpi-compilers=0 --with-mpi-include=/usr/include/openmpi-x86_64 --with-mpi-lib="[/usr/lib64/openmpi/lib/libmpi.so,/usr/lib64/openmpi/lib/libmpi.so,/usr/lib64/openmpi/lib/libmpi_mpifh.so]" --with-blas-lib="[/opt/intel/oneapi/mkl/latest/lib/libmkl_intel_lp64.so,/opt/intel/oneapi/mkl/latest/lib/libmkl_gnu_thread.so,/opt/intel/oneapi/mkl/latest/lib/libmkl_core.so]" --with-lapack-lib="[/opt/intel/oneapi/mkl/latest/lib/libmkl_intel_lp64.so,/opt/intel/oneapi/mkl/latest/lib/libmkl_gnu_thread.so,/opt/intel/oneapi/mkl/latest/lib/libmkl_core.so]" --with-mumps=1 --with-mumps-include=/home/salzman/local/i4_gcc/include --with-mumps-lib="[/home/salzman/local/i4_gcc/lib/libdmumps.so,/home/salzman/local/i4_gcc/lib/libmumps_common.so,/home/salzman/local/i4_gcc/lib/libpord.so]" --with-scalapack-lib="[/opt/intel/oneapi/mkl/latest/lib/libmkl_scalapack_lp64.so,/opt/intel/oneapi/mkl/latest/lib/libmkl_blacs_openmpi_lp64.so]" --with-mkl_pardiso=1 --with-mkl_pardiso-include=/opt/intel/oneapi/mkl/latest/include --with-mkl_pardiso-lib="[/opt/intel/oneapi/mkl/latest/lib/intel64/libmkl_intel_lp64.so]" --with-hdf5=1 --with-hdf5-include=/usr/include/openmpi-x86_64 --with-hdf5-lib="[/usr/lib64/openmpi/lib/libhdf5.so]" --with-pastix=0 --download-pastix=no --with-hwloc=1 --with-hwloc-dir=/home/salzman/local/i4_gcc --download-hwloc=no --with-ptscotch-include=/home/salzman/local/i4_gcc/include --with-ptscotch-lib="[/home/salzman/local/i4_gcc/lib/libptscotch.a,/home/salzman/local/i4_gcc/lib/libptscotcherr.a,/home/salzman/local/i4_gcc/lib/libptscotcherrexit.a,/home/salzman/local/i4_gcc/lib/libscotch.a,/home/salzman/local/i4_gcc/lib/libscotcherr.a,/home/salzman/local/i4_gcc/lib/libscotcherrexit.a]" --with-hypre=1 --download-hypre=yes --with-suitesparse=1 --with-suitesparse-include=/home/salzman/local/i4_gcc/include --with-suitesparse-lib="[/home/salzman/local/i4_gcc/lib/libsuitesparseconfig.so,/home/salzman/local/i4_gcc/lib/libumfpack.so,/home/salzman/local/i4_gcc/lib/libklu.so,/home/salzman/local/i4_gcc/lib/libcholmod.so,/home/salzman/local/i4_gcc/lib/libspqr.so,/home/salzman/local/i4_gcc/lib/libcolamd.so,/home/salzman/local/i4_gcc/lib/libccolamd.so,/home/salzman/local/i4_gcc/lib/libcamd.so,/home/salzman/local/i4_gcc/lib/libamd.so,/home/salzman/local/i4_gcc/lib/libmetis.so]" --download-suitesparse=no --with-python-exec=python3.12 --have-numpy=1 ---with-petsc4py=1 ---with-petsc4py-test-np=4 ---with-mpi4py=1 --prefix=/home/salzman/local/i4_gcc/real_arithmetic COPTFLAGS="-O3 -g " CXXOPTFLAGS="-O3 -g " FOPTFLAGS="-O3 -g " 
>>> [0]PETSC ERROR: #1 MatSetValues_SeqAIJ() at /home/salzman/devel/PETSc/petsc/src/mat/impls/aij/seq/aij.c:426 
>>> [0]PETSC ERROR: #2 MatSetValues() at /home/salzman/devel/PETSc/petsc/src/mat/interface/matrix.c:1543 
>>> [0]PETSC ERROR: #3 MatSetSeqMats_MPIAIJ() at /home/salzman/devel/PETSc/petsc/src/mat/impls/aij/mpi/mpiov.c:2965 
>>> [0]PETSC ERROR: #4 MatCreateSubMatricesMPI_MPIXAIJ() at /home/salzman/devel/PETSc/petsc/src/mat/impls/aij/mpi/mpiov.c:3163 
>>> [0]PETSC ERROR: #5 MatCreateSubMatricesMPI_MPIAIJ() at /home/salzman/devel/PETSc/petsc/src/mat/impls/aij/mpi/mpiov.c:3196 
>>> [0]PETSC ERROR: #6 MatCreateSubMatricesMPI() at /home/salzman/devel/PETSc/petsc/src/mat/interface/matrix.c:7293 
>>> [0]PETSC ERROR: #7 main() at subnb.c:181 
>>> [0]PETSC ERROR: No PETSc Option Table entries 
>>> [0]PETSC ERROR: ----------------End of Error Message -------send entire error message to petsc-ma...@mcs.anl.gov---------- 
>>> --------------------------------------------------------------------------
>>> 
>>> This message comes from executing the attached test (compared to the July 
>>> test, I simplified it by removing the block size from the matrix used for 
>>> extraction). In proc_xx_output.txt, you will find the output from the code 
>>> execution with the -ok option (i.e. irow/idxr and icol/idxc are the same, 
>>> giving a square sub-block for colour 0 distributed across the first two 
>>> processes).
>>> 
>>> As expected, in this case we obtain the 0,3,6,9 sub-block terms, which are 
>>> distributed across processes 0 and 1 (two rows per proc).
>>> 
>>> When asking for a rectangular sub-block (i.e. with no option), it crashes 
>>> with 'Column too large: col 4 max 3' on process 0. How can that be, when I 
>>> ask for 4 rows and 2 columns on this process?
>>> 
>>> Otherwise, I mentioned the dense aspect of the matrix in ex183.c because, 
>>> in that case, no matter what selection is requested, all terms are 
>>> non-null. If there is an issue with the way the selection is coded in the 
>>> user program, I think it would be masked by the full graph representation. 
>>> However, this may not be the case; I should test it.
>>> 
>>> I'll take a look at ex23.c.
>>> 
>>> Thanks,
>>> 
>>> A.S.
>>> 
>>> 
>>> 
>>> 
>>> 
>>> Le 25/08/2025 à 17:55, Mark Adams a écrit :
>>>> Ah, OK, never say never.
>>>> 
>>>> MatCreateSubMatrices seems to support creating a new matrix with the 
>>>> communicator of the IS.
>>>> It just needs to read from the input matrix and does not use it for 
>>>> communication, so it can do that.
>>>> 
>>>> As for rectangular matrices, there is no reason not to support that 
>>>> (the row IS and column IS can be distinct); see the sketch below.
>>>> Can you send the whole error message?
>>>> There may not be a test that does this, but src/mat/tests/ex23.c looks 
>>>> like it may be a rectangular matrix output.
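>>>> 
>>>> For instance, a minimal sketch of a rectangular extraction with distinct 
>>>> row and column IS (sequential variant for simplicity; indices are 
>>>> illustrative and A is the input MatAIJ):
>>>> 
>>>>   IS   isrow, iscol;
>>>>   Mat *submats;
>>>>   PetscCall(ISCreateStride(PETSC_COMM_SELF, 3, 0, 1, &isrow)); /* rows 0,1,2 */
>>>>   PetscCall(ISCreateStride(PETSC_COMM_SELF, 2, 0, 2, &iscol)); /* cols 0,2   */
>>>>   PetscCall(MatCreateSubMatrices(A, 1, &isrow, &iscol, MAT_INITIAL_MATRIX, &submats));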
>>>> 
>>>> And it should not matter if the input matrix is a 100% full sparse 
>>>> matrix. It is still MatAIJ. The semantics and the API are the same for 
>>>> sparse or dense matrices.
>>>> 
>>>> Thanks,
>>>> Mark
>>>> 
>>>> On Mon, Aug 25, 2025 at 7:31 AM Alexis SALZMAN 
>>>> <alexis.salz...@ec-nantes.fr> wrote:
>>>>> Hi,
>>>>> 
>>>>> Thanks for your answer, Mark. Perhaps MatCreateSubMatricesMPI is the only 
>>>>> PETSc function that acts on a sub-communicator — I'm not sure — but it's 
>>>>> clear that there's no ambiguity on that point. The first line of the 
>>>>> documentation for that function states that it 'may live on subcomms'. 
>>>>> This is confirmed by the 'src/mat/tests/ex183.c' test case. I used this 
>>>>> test case to understand the function, which helped me with my code and 
>>>>> the example I provided in my initial post. Unfortunately, in this 
>>>>> example, the matrix from which the sub-matrices are extracted is dense, 
>>>>> even though it uses a sparse structure. This does not clarify how to 
>>>>> define sub-matrices when extracting from a sparse distributed matrix. 
>>>>> Since my initial post, I have discovered that having more columns than 
>>>>> rows can also result in the same error message.
>>>>> So, my questions boil down to:
>>>>> 
>>>>> Can MatCreateSubMatricesMPI extract rectangular matrices from a square 
>>>>> distributed sparse matrix?
>>>>> 
>>>>> If not, the fact that only square matrices can be extracted in this 
>>>>> context should perhaps be mentioned in the documentation.
>>>>> 
>>>>> If so, I would be very grateful for any assistance in defining an IS pair 
>>>>> in this context.
>>>>> 
>>>>> Regards
>>>>> 
>>>>> A.S.
>>>>> 
>>>>> Le 27/07/2025 à 00:15, Mark Adams a écrit :
>>>>>> First, you cannot mix communicators in PETSc calls in general (ever?), 
>>>>>> but this error looks like you might be asking for a row from the matrix 
>>>>>> that does not exist.
>>>>>> You should start with a PETSc example code. Test it and modify it to 
>>>>>> suit your needs.
>>>>>> 
>>>>>> Good luck,
>>>>>> Mark
>>>>>> 
>>>>>> On Fri, Jul 25, 2025 at 9:31 AM Alexis SALZMAN 
>>>>>> <alexis.salz...@ec-nantes.fr> wrote:
>>>>>>> Hi,
>>>>>>> 
>>>>>>> As I am relatively new to PETSc, I may have misunderstood how to use the 
>>>>>>> MatCreateSubMatricesMPI function. The attached code is tuned for three 
>>>>>>> processes and extracts, from an MPIAIJ matrix, one matrix for each colour 
>>>>>>> of a subcommunicator created using the MPI_Comm_split function. The 
>>>>>>> following error message appears when the code is run in its default 
>>>>>>> configuration (i.e. when a rectangular matrix is extracted with more rows 
>>>>>>> than columns for colour 0):
>>>>>>> 
>>>>>>> [0]PETSC ERROR: --------------------- Error Message 
>>>>>>> --------------------------------------------------------------
>>>>>>> [0]PETSC ERROR: Argument out of range
>>>>>>> [0]PETSC ERROR: Column too large: col 4 max 3
>>>>>>> [0]PETSC ERROR: See https://petsc.org/release/faq/ for trouble shooting.
>>>>>>> [0]PETSC ERROR: Petsc Release Version 3.22.2, unknown
>>>>>>> 
>>>>>>> ... petsc git hash 2a89477b25f compiled on a dell i9 computer with Gcc 
>>>>>>> 14.3, mkl 2025.2, .....
>>>>>>> [0]PETSC ERROR: #1 MatSetValues_SeqAIJ() at ...petsc/src/mat/impls/aij/seq/aij.c:426
>>>>>>> [0]PETSC ERROR: #2 MatSetValues() at ...petsc/src/mat/interface/matrix.c:1543
>>>>>>> [0]PETSC ERROR: #3 MatSetSeqMats_MPIAIJ() at .../petsc/src/mat/impls/aij/mpi/mpiov.c:2965
>>>>>>> [0]PETSC ERROR: #4 MatCreateSubMatricesMPI_MPIXAIJ() at .../petsc/src/mat/impls/aij/mpi/mpiov.c:3163
>>>>>>> [0]PETSC ERROR: #5 MatCreateSubMatricesMPI_MPIAIJ() at .../petsc/src/mat/impls/aij/mpi/mpiov.c:3196
>>>>>>> [0]PETSC ERROR: #6 MatCreateSubMatricesMPI() at .../petsc/src/mat/interface/matrix.c:7293
>>>>>>> [0]PETSC ERROR: #7 main() at sub.c:169
>>>>>>> 
>>>>>>> When the '-ok' option is selected, the code extracts a square matrix 
>>>>>>> for 
>>>>>>> colour 0, which runs smoothly in this case. Selecting the '-trans' 
>>>>>>> option swaps the row and column selection indices, providing a 
>>>>>>> transposed submatrix smoothly. For colour 1, which uses only one 
>>>>>>> process 
>>>>>>> and is therefore sequential, rectangular extraction is OK regardless of 
>>>>>>> the shape.
>>>>>>> 
>>>>>>> Is this dependency on the shape expected? Have I missed an important 
>>>>>>> tuning step somewhere?
>>>>>>> 
>>>>>>> Thank you in advance for any clarification.
>>>>>>> 
>>>>>>> Regards
>>>>>>> 
>>>>>>> A.S.
>>>>>>> 
>>>>>>> P.S.: I'm sorry, but as I'm leaving my office this evening for the 
>>>>>>> following weeks, I won't be very responsive during that period.
>>>>>>> 
>>>>>>> 
> 
