It’s indeed very suspicious (to me) that we are using rmap to change a column index. Switching to cmap gets your code running, but I’ll need to see if this triggers regressions.
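For what it’s worth, here is a tiny standalone check (not part of the patch; the 8x4 size is made up) of why shifting a column index by the rmap range is dubious: for a rectangular MPIAIJ matrix, the row ownership range (C->rmap) and the column ownership range (C->cmap) generally differ, so the two shifts only coincide for square layouts.

#include <petscmat.h>

int main(int argc, char **argv)
{
  Mat      C;
  PetscInt rstart, rend, cstart, cend;

  PetscCall(PetscInitialize(&argc, &argv, NULL, NULL));
  /* 8 x 4 rectangular MPIAIJ matrix, local sizes left to PETSC_DECIDE */
  PetscCall(MatCreateAIJ(PETSC_COMM_WORLD, PETSC_DECIDE, PETSC_DECIDE, 8, 4, 1, NULL, 1, NULL, &C));
  PetscCall(MatGetOwnershipRange(C, &rstart, &rend));       /* range of C->rmap */
  PetscCall(MatGetOwnershipRangeColumn(C, &cstart, &cend)); /* range of C->cmap */
  PetscCall(PetscSynchronizedPrintf(PETSC_COMM_WORLD, "rows [%" PetscInt_FMT ", %" PetscInt_FMT ")  cols [%" PetscInt_FMT ", %" PetscInt_FMT ")\n", rstart, rend, cstart, cend));
  PetscCall(PetscSynchronizedFlush(PETSC_COMM_WORLD, PETSC_STDOUT));
  PetscCall(MatDestroy(&C));
  PetscCall(PetscFinalize());
  return 0;
}

On two ranks this should print something like rows [0, 4) cols [0, 2) on rank 0 and rows [4, 8) cols [2, 4) on rank 1.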
Thanks for the report,
Pierre

diff --git a/src/mat/impls/aij/mpi/mpiov.c b/src/mat/impls/aij/mpi/mpiov.c
index d1037d7d817..051981ebe9a 100644
--- a/src/mat/impls/aij/mpi/mpiov.c
+++ b/src/mat/impls/aij/mpi/mpiov.c
@@ -2948,3 +2948,3 @@ PetscErrorCode MatSetSeqMats_MPIAIJ(Mat C, IS rowemb, IS dcolemb, IS ocolemb, Ma
-  PetscCall(PetscLayoutGetRange(C->rmap, &rstart, &rend));
+  PetscCall(PetscLayoutGetRange(C->cmap, &rstart, &rend));
   shift = rend - rstart;

$ cat proc_0_output.txt
rstart 0 rend 4
Mat Object: 3 MPI processes
  type: mpiaij
row 0: (0, 101.) (3, 104.) (6, 107.) (9, 110.)
row 1: (2, 203.) (5, 206.) (8, 209.) (11, 212.)
row 2: (1, 302.) (4, 305.) (7, 308.) (10, 311.)
row 3: (0, 401.) (3, 404.) (6, 407.) (9, 410.)
row 4: (2, 503.) (5, 506.) (8, 509.) (11, 512.)
row 5: (1, 602.) (4, 605.) (7, 608.) (10, 611.)
row 6: (0, 701.) (3, 704.) (6, 707.) (9, 710.)
row 7: (2, 803.) (5, 806.) (8, 809.) (11, 812.)
row 8: (1, 902.) (4, 905.) (7, 908.) (10, 911.)
row 9: (0, 1001.) (3, 1004.) (6, 1007.) (9, 1010.)
row 10: (2, 1103.) (5, 1106.) (8, 1109.) (11, 1112.)
row 11: (1, 1202.) (4, 1205.) (7, 1208.) (10, 1211.)
idxr proc
IS Object: 2 MPI processes
  type: general
[0] Number of indices in set 4
[0] 0 0
[0] 1 1
[0] 2 2
[0] 3 3
[1] Number of indices in set 4
[1] 0 4
[1] 1 5
[1] 2 6
[1] 3 7
idxc proc
IS Object: 2 MPI processes
  type: general
[0] Number of indices in set 2
[0] 0 0
[0] 1 1
[1] Number of indices in set 2
[1] 0 6
[1] 1 7
Mat Object: 2 MPI processes
  type: mpiaij
row 0: (0, 101.) (2, 107.)
row 1:
row 2: (1, 302.) (3, 308.)
row 3: (0, 401.) (2, 407.)
row 4:
row 5: (1, 602.) (3, 608.)
row 6: (0, 701.) (2, 707.)
row 7:
rstart 0 rend 4
local row 0: ( 0 , 1.010000e+02) ( 2 , 1.070000e+02)
local row 1:
local row 2: ( 1 , 3.020000e+02) ( 3 , 3.080000e+02)
local row 3: ( 0 , 4.010000e+02) ( 2 , 4.070000e+02)

> On 26 Aug 2025, at 3:18 PM, Pierre Jolivet <pie...@joliv.et> wrote:
> 
> 
>> On 26 Aug 2025, at 12:50 PM, Alexis SALZMAN <alexis.salz...@ec-nantes.fr> wrote:
>> 
>> Mark, you were right and I was wrong about the dense matrix. Adding explicit zeros to the distributed matrix used to extract the sub-matrices (making it dense) in my test does not change the behaviour: there is still an error.
>> 
>> I am finding it increasingly difficult to understand the logic of the row and column 'IS' creation. I ran many tests to achieve the desired result: a rectangular sub-matrix (so a rectangular or square sub-matrix appears to be possible). However, many others resulted in the same kind of error.
>> 
> This may be a PETSc bug in MatSetSeqMats_MPIAIJ().
> -> 2965       PetscCall(MatSetValues(aij->B, 1, &row, 1, &col, &v, INSERT_VALUES));
> col has a value of 4, which doesn’t make sense since the output Mat has 4 columns (thus, as the error message suggests, the value should be less than or equal to 3).
> 
> Thanks,
> Pierre
>> From what I observed, the test only works if the column selection contribution (size_c in the test) has a specific value related to the row selection contribution (size_r in the test) for proc 0 (rank for both communicator and sub-communicator):
>> 
>> if size_r==2, then it works if size_c<=2.
>> if size_r>=3 and size_r<=5, then size_c==size_r is the only working case.
>> This occurs "regardless" of what is requested in proc 1 and in selr/selc (it can't be a dummy setting, though). In any case, it's certainly not an exhaustive analysis.
>> 
>> Many thanks to anyone who can explain to me the logic behind the construction of row and column 'IS'.
>> 
>> Regards
>> 
>> A.S.
>> 
>> 
>> 
>> On 25/08/2025 at 20:00, Alexis SALZMAN wrote:
>>> Thanks Mark for your attention.
>>> 
>>> The uncleaned error message, compared to my post in July, is as follows:
>>> 
>>> [0]PETSC ERROR: --------------------- Error Message --------------------------------------------------------------
>>> [0]PETSC ERROR: Argument out of range
>>> [0]PETSC ERROR: Column too large: col 4 max 3
>>> [0]PETSC ERROR: See https://petsc.org/release/faq/ for trouble shooting.
>>> [0]PETSC ERROR: Petsc Release Version 3.22.2, unknown
>>> [0]PETSC ERROR: subnb with 3 MPI process(es) and PETSC_ARCH on pc-str97.ec-nantes.fr by salzman Mon Aug 25 19:11:37 2025
>>> [0]PETSC ERROR: Configure options: PETSC_ARCH=real_fc41_Release_gcc_i4 PETSC_DIR=/home/salzman/devel/ExternalLib/build/PETSC/petsc --doCleanup=1 --with-scalar-type=real --known-level1-dcache-linesize=64 --with-cc=gcc --CFLAGS="-fPIC " --CC_LINKER_FLAGS=-fopenmp --with-cxx=g++ --with-cxx-dialect=c++20 --CXXFLAGS="-fPIC " --CXX_LINKER_FLAGS=-fopenmp --with-fc=gfortran --FFLAGS="-fPIC " --FC_LINKER_FLAGS=-fopenmp --with-debugging=0 --with-fortran-bindings=0 --with-fortran-kernels=1 --with-mpi-compilers=0 --with-mpi-include=/usr/include/openmpi-x86_64 --with-mpi-lib="[/usr/lib64/openmpi/lib/libmpi.so,/usr/lib64/openmpi/lib/libmpi.so,/usr/lib64/openmpi/lib/libmpi_mpifh.so]" --with-blas-lib="[/opt/intel/oneapi/mkl/latest/lib/libmkl_intel_lp64.so,/opt/intel/oneapi/mkl/latest/lib/libmkl_gnu_thread.so,/opt/intel/oneapi/mkl/latest/lib/libmkl_core.so]" --with-lapack-lib="[/opt/intel/oneapi/mkl/latest/lib/libmkl_intel_lp64.so,/opt/intel/oneapi/mkl/latest/lib/libmkl_gnu_thread.so,/opt/intel/oneapi/mkl/latest/lib/libmkl_core.so]" --with-mumps=1 --with-mumps-include=/home/salzman/local/i4_gcc/include --with-mumps-lib="[/home/salzman/local/i4_gcc/lib/libdmumps.so,/home/salzman/local/i4_gcc/lib/libmumps_common.so,/home/salzman/local/i4_gcc/lib/libpord.so]" --with-scalapack-lib="[/opt/intel/oneapi/mkl/latest/lib/libmkl_scalapack_lp64.so,/opt/intel/oneapi/mkl/latest/lib/libmkl_blacs_openmpi_lp64.so]" --with-mkl_pardiso=1 --with-mkl_pardiso-include=/opt/intel/oneapi/mkl/latest/include --with-mkl_pardiso-lib="[/opt/intel/oneapi/mkl/latest/lib/intel64/libmkl_intel_lp64.so]" --with-hdf5=1 --with-hdf5-include=/usr/include/openmpi-x86_64 --with-hdf5-lib="[/usr/lib64/openmpi/lib/libhdf5.so]" --with-pastix=0 --download-pastix=no --with-hwloc=1 --with-hwloc-dir=/home/salzman/local/i4_gcc --download-hwloc=no --with-ptscotch-include=/home/salzman/local/i4_gcc/include --with-ptscotch-lib="[/home/salzman/local/i4_gcc/lib/libptscotch.a,/home/salzman/local/i4_gcc/lib/libptscotcherr.a,/home/salzman/local/i4_gcc/lib/libptscotcherrexit.a,/home/salzman/local/i4_gcc/lib/libscotch.a,/home/salzman/local/i4_gcc/lib/libscotcherr.a,/home/salzman/local/i4_gcc/lib/libscotcherrexit.a]" --with-hypre=1 --download-hypre=yes --with-suitesparse=1 --with-suitesparse-include=/home/salzman/local/i4_gcc/include --with-suitesparse-lib="[/home/salzman/local/i4_gcc/lib/libsuitesparseconfig.so,/home/salzman/local/i4_gcc/lib/libumfpack.so,/home/salzman/local/i4_gcc/lib/libklu.so,/home/salzman/local/i4_gcc/lib/libcholmod.so,/home/salzman/local/i4_gcc/lib/libspqr.so,/home/salzman/local/i4_gcc/lib/libcolamd.so,/home/salzman/local/i4_gcc/lib/libccolamd.so,/home/salzman/local/i4_gcc/lib/libcamd.so,/home/salzman/local/i4_gcc/lib/libamd.so,/home/salzman/local/i4_gcc/lib/libmetis.so]" --download-suitesparse=no --with-python-exec=python3.12 --have-numpy=1 ---with-petsc4py=1 ---with-petsc4py-test-np=4 ---with-mpi4py=1 --prefix=/home/salzman/local/i4_gcc/real_arithmetic COPTFLAGS="-O3 -g " CXXOPTFLAGS="-O3 -g " FOPTFLAGS="-O3 -g "
>>> [0]PETSC ERROR: #1 MatSetValues_SeqAIJ() at /home/salzman/devel/PETSc/petsc/src/mat/impls/aij/seq/aij.c:426
>>> [0]PETSC ERROR: #2 MatSetValues() at /home/salzman/devel/PETSc/petsc/src/mat/interface/matrix.c:1543
>>> [0]PETSC ERROR: #3 MatSetSeqMats_MPIAIJ() at /home/salzman/devel/PETSc/petsc/src/mat/impls/aij/mpi/mpiov.c:2965
>>> [0]PETSC ERROR: #4 MatCreateSubMatricesMPI_MPIXAIJ() at /home/salzman/devel/PETSc/petsc/src/mat/impls/aij/mpi/mpiov.c:3163
>>> [0]PETSC ERROR: #5 MatCreateSubMatricesMPI_MPIAIJ() at /home/salzman/devel/PETSc/petsc/src/mat/impls/aij/mpi/mpiov.c:3196
>>> [0]PETSC ERROR: #6 MatCreateSubMatricesMPI() at /home/salzman/devel/PETSc/petsc/src/mat/interface/matrix.c:7293
>>> [0]PETSC ERROR: #7 main() at subnb.c:181
>>> [0]PETSC ERROR: No PETSc Option Table entries
>>> [0]PETSC ERROR: ----------------End of Error Message -------send entire error message to petsc-ma...@mcs.anl.gov----------
>>> --------------------------------------------------------------------------
>>> 
>>> This message comes from executing the attached test (I simplified the test by removing the block size from the matrix used for extraction, compared to the July test). In proc_xx_output.txt, you will find the output from the code execution with the -ok option (i.e. irow/idxr and icol/idxc are the same, i.e. a square sub-block for colour 0 distributed across the first two processes).
>>> 
>>> As expected, in this case we obtain the 0,3,6,9 sub-block terms, which are distributed across processes 0 and 1 (two rows per proc).
>>> 
>>> When asking for a rectangular sub-block (i.e. with no option), it crashes with "Column too large: col 4 max 3" on process 0, even though I ask for 4 rows and 2 columns in this process.
>>> 
>>> Otherwise, I mention the dense aspect of the matrix in ex183.c because, in this case, no matter what selection is requested, all terms are non-null. If there is an issue with the way the selection is coded in the user program, I think it will be masked thanks to the full graph representation. However, this may not be the case; I should test it.
>>> 
>>> I'll take a look at ex23.c.
>>> 
>>> Thanks,
>>> 
>>> A.S.
>>> 
>>> 
>>> 
>>> 
>>> 
>>> On 25/08/2025 at 17:55, Mark Adams wrote:
>>>> Ah, OK, never say never.
>>>> 
>>>> MatCreateSubMatrices seems to support creating a new matrix with the communicator of the IS.
>>>> It just needs to read from the input matrix and does not use it for communication, so it can do that.
>>>> 
>>>> As far as rectangular matrices go, there is no reason not to support that (the row IS and column IS can be distinct).
>>>> Can you send the whole error message?
>>>> There may not be a test that does this, but src/mat/tests/ex23.c looks like it may be a rectangular matrix output.
>>>> 
>>>> And, it should not matter if the input matrix is a 100% full sparse matrix. It is still MatAIJ.
>>>> The semantics and API are the same for sparse or dense matrices.
>>>> 
>>>> Thanks,
>>>> Mark
>>>> 
>>>> On Mon, Aug 25, 2025 at 7:31 AM Alexis SALZMAN <alexis.salz...@ec-nantes.fr> wrote:
>>>>> Hi,
>>>>> 
>>>>> Thanks for your answer, Mark. Perhaps MatCreateSubMatricesMPI is the only PETSc function that acts on a sub-communicator (I'm not sure), but it's clear that there's no ambiguity on that point. The first line of the documentation for that function states that it 'may live on subcomms'. This is confirmed by the 'src/mat/tests/ex183.c' test case. I used this test case to understand the function, which helped me with my code and the example I provided in my initial post. Unfortunately, in this example, the matrix from which the sub-matrices are extracted is dense, even though it uses a sparse structure. This does not clarify how to define sub-matrices when extracting from a sparse distributed matrix. Since my initial post, I have discovered that having more columns than rows can also result in the same error message.
>>>>> So, my questions boil down to:
>>>>> 
>>>>> Can MatCreateSubMatricesMPI extract rectangular matrices from a square distributed sparse matrix?
>>>>> 
>>>>> If not, the fact that only square matrices can be extracted in this context should perhaps be mentioned in the documentation.
>>>>> 
>>>>> If so, I would be very grateful for any assistance in defining an IS pair in this context.
>>>>> 
>>>>> Regards
>>>>> 
>>>>> A.S.
>>>>> 
>>>>> On 27/07/2025 at 00:15, Mark Adams wrote:
>>>>>> First, you cannot mix communicators in PETSc calls in general (ever?), but this error looks like you might be asking for a row from the matrix that does not exist.
>>>>>> You should start with a PETSc example code. Test it and modify it to suit your needs.
>>>>>> 
>>>>>> Good luck,
>>>>>> Mark
>>>>>> 
>>>>>> On Fri, Jul 25, 2025 at 9:31 AM Alexis SALZMAN <alexis.salz...@ec-nantes.fr> wrote:
>>>>>>> Hi,
>>>>>>> 
>>>>>>> As I am relatively new to PETSc, I may have misunderstood how to use the MatCreateSubMatricesMPI function. The attached code is tuned for three processes and extracts, from an MPIAIJ matrix, one matrix for each colour of a sub-communicator created with the MPI_Comm_split function. The following error message appears when the code is set to its default configuration (i.e. when a rectangular matrix is extracted with more rows than columns for colour 0):
>>>>>>> 
>>>>>>> [0]PETSC ERROR: --------------------- Error Message --------------------------------------------------------------
>>>>>>> [0]PETSC ERROR: Argument out of range
>>>>>>> [0]PETSC ERROR: Column too large: col 4 max 3
>>>>>>> [0]PETSC ERROR: See https://petsc.org/release/faq/ for trouble shooting.
>>>>>>> [0]PETSC ERROR: Petsc Release Version 3.22.2, unknown
>>>>>>> 
>>>>>>> ... petsc git hash 2a89477b25f compiled on a dell i9 computer with Gcc 14.3, mkl 2025.2, .....
>>>>>>> [0]PETSC ERROR: #1 MatSetValues_SeqAIJ() at ...petsc/src/mat/impls/aij/seq/aij.c:426
>>>>>>> [0]PETSC ERROR: #2 MatSetValues() at ...petsc/src/mat/interface/matrix.c:1543
>>>>>>> [0]PETSC ERROR: #3 MatSetSeqMats_MPIAIJ() at .../petsc/src/mat/impls/aij/mpi/mpiov.c:2965
>>>>>>> [0]PETSC ERROR: #4 MatCreateSubMatricesMPI_MPIXAIJ() at .../petsc/src/mat/impls/aij/mpi/mpiov.c:3163
>>>>>>> [0]PETSC ERROR: #5 MatCreateSubMatricesMPI_MPIAIJ() at .../petsc/src/mat/impls/aij/mpi/mpiov.c:3196
>>>>>>> [0]PETSC ERROR: #6 MatCreateSubMatricesMPI() at .../petsc/src/mat/interface/matrix.c:7293
>>>>>>> [0]PETSC ERROR: #7 main() at sub.c:169
>>>>>>> 
>>>>>>> When the '-ok' option is selected, the code extracts a square matrix for colour 0, which runs smoothly in this case. Selecting the '-trans' option swaps the row and column selection indices, providing a transposed submatrix smoothly. For colour 1, which uses only one process and is therefore sequential, rectangular extraction is OK regardless of the shape.
>>>>>>> 
>>>>>>> Is this dependency on the shape expected? Have I missed an important tuning step somewhere?
>>>>>>> 
>>>>>>> Thank you in advance for any clarification.
>>>>>>> 
>>>>>>> Regards
>>>>>>> 
>>>>>>> A.S.
>>>>>>> 
>>>>>>> P.S.: I'm sorry, but as I'm leaving my office for the following weeks this evening, I won't be very responsive during this period.
>>>>>>> 
>
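For completeness, here is a minimal sketch of the kind of IS pair construction discussed in this thread: a hypothetical 12x12 MPIAIJ matrix on 3 ranks, split into two colours with MPI_Comm_split (colour 0 = ranks 0 and 1, colour 1 = rank 2), each colour extracting one square sub-matrix through index sets created on the sub-communicator. The sizes, colours, and indices are illustrative only and are not taken from the attached test; a square selection is used because that is the case reported to work even before the rmap/cmap fix above.

#include <petscmat.h>

int main(int argc, char **argv)
{
  Mat         A, *sub;
  IS          isrow, iscol;
  MPI_Comm    subcomm;
  PetscMPIInt rank, size, color;
  PetscInt    i, n, rstart, rend, *idx;

  PetscCall(PetscInitialize(&argc, &argv, NULL, NULL));
  PetscCallMPI(MPI_Comm_rank(PETSC_COMM_WORLD, &rank));
  PetscCallMPI(MPI_Comm_size(PETSC_COMM_WORLD, &size));
  PetscCheck(size == 3, PETSC_COMM_WORLD, PETSC_ERR_WRONG_MPI_SIZE, "Run with 3 MPI ranks");

  /* Parent 12x12 MPIAIJ matrix, 4 rows per rank, diagonal entries only */
  PetscCall(MatCreateAIJ(PETSC_COMM_WORLD, 4, 4, 12, 12, 1, NULL, 0, NULL, &A));
  PetscCall(MatGetOwnershipRange(A, &rstart, &rend));
  for (i = rstart; i < rend; i++) PetscCall(MatSetValue(A, i, i, (PetscScalar)(i + 1), INSERT_VALUES));
  PetscCall(MatAssemblyBegin(A, MAT_FINAL_ASSEMBLY));
  PetscCall(MatAssemblyEnd(A, MAT_FINAL_ASSEMBLY));

  /* colour 0 = ranks 0 and 1, colour 1 = rank 2 */
  color = (rank < 2) ? 0 : 1;
  PetscCallMPI(MPI_Comm_split(PETSC_COMM_WORLD, color, rank, &subcomm));

  /* Global indices of the parent matrix; the ISs live on the SUB-communicator */
  n = 4;
  PetscCall(PetscMalloc1(n, &idx));
  for (i = 0; i < n; i++) idx[i] = (color == 0) ? 4 * rank + i /* 0..3 and 4..7 */ : 8 + i /* 8..11 */;
  PetscCall(ISCreateGeneral(subcomm, n, idx, PETSC_COPY_VALUES, &isrow));
  PetscCall(ISDuplicate(isrow, &iscol)); /* square selection: same rows and columns */
  PetscCall(PetscFree(idx));

  /* One sub-matrix per colour: 8x8 on the 2-rank subcomm, 4x4 sequential on rank 2 */
  PetscCall(MatCreateSubMatricesMPI(A, 1, &isrow, &iscol, MAT_INITIAL_MATRIX, &sub));
  PetscCall(MatView(sub[0], PETSC_VIEWER_STDOUT_(subcomm)));

  PetscCall(MatDestroySubMatrices(1, &sub));
  PetscCall(ISDestroy(&isrow));
  PetscCall(ISDestroy(&iscol));
  PetscCall(MatDestroy(&A));
  PetscCallMPI(MPI_Comm_free(&subcomm));
  PetscCall(PetscFinalize());
  return 0;
}

The intent here is that the communicator of isrow/iscol (not that of A) determines where sub[0] lives, and the indices are global row/column indices of the parent matrix.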