Thanks for the suggestion Pierre. Yes B is duplicated by all processes.
In this case, should B be created as a sequential sparse matrix using COMM_SELF? I guess if not, the multiplication of B with the output of https://petsc.org/main/manualpages/Mat/MatMPIAIJGetLocalMat/ would not go through, right? Thanks, Thanos > On 23 Aug 2023, at 10:47, Pierre Jolivet <pierre.joli...@lip6.fr> wrote: > > > >> On 23 Aug 2023, at 5:35 PM, Thanasis Boutsikakis >> <thanasis.boutsika...@corintis.com> wrote: >> >> Hi all, >> >> I am trying to multiply two Petsc matrices as C = A * B, where A is a tall >> matrix and B is a relatively small matrix. >> >> I have taken the decision to create A as (row-)partitioned matrix and B as a >> non-partitioned matrix that it is entirely shared by all procs (to avoid >> unnecessary communication). >> >> Here is my code: >> >> import numpy as np >> from firedrake import COMM_WORLD >> from firedrake.petsc import PETSc >> from numpy.testing import assert_array_almost_equal >> >> nproc = COMM_WORLD.size >> rank = COMM_WORLD.rank >> >> def create_petsc_matrix_non_partitioned(input_array): >> """Building a mpi non-partitioned petsc matrix from an array >> >> Args: >> input_array (np array): Input array >> sparse (bool, optional): Toggle for sparese or dense. Defaults to >> True. >> >> Returns: >> mpi mat: PETSc matrix >> """ >> assert len(input_array.shape) == 2 >> >> m, n = input_array.shape >> >> matrix = PETSc.Mat().createAIJ(size=((m, n), (m, n)), comm=COMM_WORLD) >> >> # Set the values of the matrix >> matrix.setValues(range(m), range(n), input_array[:, :], addv=False) >> >> # Assembly the matrix to compute the final structure >> matrix.assemblyBegin() >> matrix.assemblyEnd() >> >> return matrix >> >> >> def create_petsc_matrix(input_array, partition_like=None): >> """Create a PETSc matrix from an input_array >> >> Args: >> input_array (np array): Input array >> partition_like (petsc mat, optional): Petsc matrix. Defaults to None. >> sparse (bool, optional): Toggle for sparese or dense. Defaults to >> True. >> >> Returns: >> petsc mat: PETSc matrix >> """ >> # Check if input_array is 1D and reshape if necessary >> assert len(input_array.shape) == 2, "Input array should be 2-dimensional" >> global_rows, global_cols = input_array.shape >> >> comm = COMM_WORLD >> if partition_like is not None: >> local_rows_start, local_rows_end = partition_like.getOwnershipRange() >> local_rows = local_rows_end - local_rows_start >> >> # No parallelization in the columns, set local_cols = None to >> parallelize >> size = ((local_rows, global_rows), (global_cols, global_cols)) >> else: >> size = ((None, global_rows), (global_cols, global_cols)) >> >> matrix = PETSc.Mat().createAIJ(size=size, comm=comm) >> matrix.setUp() >> >> local_rows_start, local_rows_end = matrix.getOwnershipRange() >> >> for counter, i in enumerate(range(local_rows_start, local_rows_end)): >> # Calculate the correct row in the array for the current process >> row_in_array = counter + local_rows_start >> matrix.setValues( >> i, range(global_cols), input_array[row_in_array, :], addv=False >> ) >> >> # Assembly the matrix to compute the final structure >> matrix.assemblyBegin() >> matrix.assemblyEnd() >> >> return matrix >> >> >> m, k = 10, 3 >> # Generate the random numpy matrices >> np.random.seed(0) # sets the seed to 0 >> A_np = np.random.randint(low=0, high=6, size=(m, k)) >> B_np = np.random.randint(low=0, high=6, size=(k, k)) >> >> >> A = create_petsc_matrix(A_np) >> >> B = create_petsc_matrix_non_partitioned(B_np) >> >> # Now perform the multiplication >> C = A * B >> >> The problem with this is that there is a mismatch between the local rows of >> A (depend on the partitioning) and the global rows of B (3 for all procs), >> so that the multiplication cannot happen in parallel. Here is the error: >> >> [0]PETSC ERROR: >> ------------------------------------------------------------------------ >> [0]PETSC ERROR: Caught signal number 11 SEGV: Segmentation Violation, >> probably memory access out of range >> [0]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger >> [0]PETSC ERROR: or see https://petsc.org/release/faq/#valgrind and >> https://petsc.org/release/faq/ >> [0]PETSC ERROR: configure using --with-debugging=yes, recompile, link, and >> run >> [0]PETSC ERROR: to get more information on the crash. >> [0]PETSC ERROR: Run with -malloc_debug to check if memory corruption is >> causing the crash. >> application called MPI_Abort(MPI_COMM_WORLD, 59) - process 0 >> [unset]: write_line error; fd=-1 buf=:cmd=abort exitcode=59 >> : >> system msg for write_line failure : Bad file descriptor >> >> >> Is there a standard way to achieve this? > > Your B is duplicated by all processes? > If so, then, call > https://petsc.org/main/manualpages/Mat/MatMPIAIJGetLocalMat/, do a sequential > product with B on COMM_SELF, not COMM_WORLD, and use > https://petsc.org/main/manualpages/Mat/MatCreateMPIMatConcatenateSeqMat/ with > the output. > > Thanks, > Pierre > >> Thanks, >> Thanos >> >>