Don't worry about the extra memory of the copy; LU factorizations take much more space than the original matrix so this "duplicate" matrix is no big deal. At least to start, later if profiling shows it is a problem you could create a new matrix class that is already "joined" and feed that directly to cpardiso.
Barry On Sep 17, 2013, at 5:35 PM, Jose David Bermeol <[email protected]> wrote: > Hi, right now I'm coding a solver for MATMPIAIJ matrices, as you know this > kind of matrix has a diagonal matrix A of size n x n and a matrix B with the > off-diagonal matrices of size n x (N - n). The problem is that my solver for > each MPI process receive 3 arrays that represents the rows in the MPI > process(aij format), that means the complete n x N matrix. > > What I did was a join of the A and B matrix, however this implies almost to > duplicate everything in memory. > > I don't know if you know a better strategy for this(less memory overhead). > Thanks > > Here is my code if you see something that can be improve: > > /* > * MATMPIAIJ matrices are divided in two matrices, > * diagonal matrix(A) and off-diagonal matrix(B), > * both in aij format, as cpardiso only recives one matrix, > * this method will join A and B matrix in one, return aij > * representation and number of non-zero elements. > * > * This method will copy(duplicate) A and B elements in a new matrix. > * > * Input: > * - Mat A: MATMPIAIJ matrix > * - int shift: matrix index.(cpardiso requires fortan indexing) > * - 0 for c representation > * - 1 for fortran representation > * - MatReuse reuse: > * - MAT_INITIAL_MATRIX: Create a new aij representation > * - MAT_REUSE_MATRIX: Reuse all aij representation and > just change values > * Output: > * - int *nnz: Number of nonzero-elements. > * - int **r pointer to i index > * - int **c pointer to j elements > * - PetscScalar **v: Non-zero elements > */ > #undef __FUNCT__ > #define __FUNCT__ "MatJoin_CPARDISO" > PetscErrorCode MatJoin_CPARDISO(Mat A, int shift, MatReuse reuse, int *nnz, > int **r, int **c, PetscScalar **v){ > > const PetscInt *ai, *aj, *bi, *bj, *garray, m=A->rmap->n, *ajj, *bjj; > PetscErrorCode ierr; > PetscInt rstart, nz, i, jj, countA, countB, idxA, idxB; > PetscInt *row,*col; > const PetscScalar *av, *bv,*v1,*v2; > PetscScalar *val; > Mat_MPIAIJ *mat = (Mat_MPIAIJ*)A->data; > Mat_SeqAIJ *aa = (Mat_SeqAIJ*)(mat->A)->data; > Mat_SeqAIJ *bb = (Mat_SeqAIJ*)(mat->B)->data; > > PetscFunctionBegin; > ai=aa->i; aj=aa->j; bi=bb->i; bj=bb->j; rstart= A->rmap->rstart; > av=aa->a; bv=bb->a; > > garray = mat->garray; > > if (reuse == MAT_INITIAL_MATRIX) { > nz = aa->nz + bb->nz; > *nnz = nz; > ierr = PetscMalloc(((nz + m + 1)*sizeof(PetscInt) + > nz*sizeof(PetscScalar)), &row);CHKERRQ(ierr); > col = row + m + 1; > val = (PetscScalar*)(col + nz); > > *r = row; *c = col; *v = val; > } else { > row = *r; col = *c; val = *v; > } > > jj = 0; > for (i=0; i<m; i++) { > > countA = ai[i+1] - ai[i]; > countB = bi[i+1] - bi[i]; > ajj = aj + ai[i]; > bjj = bj + bi[i]; > v1 = av + ai[i]; > v2 = bv + bi[i]; > > if(reuse == MAT_INITIAL_MATRIX){ > row[i] = jj + shift; > } > idxA = 0; idxB = 0; > > if(countA + countB == 0){ > } else if(countB == 0){ > for (idxA = 0; idxA < countA; idxA++) { > if(reuse == MAT_INITIAL_MATRIX){ > col[jj] = rstart + ajj[idxA] + shift; > } > val[jj] = v1[idxA]; > jj++; > } > } else if(countA == 0){ > for (idxB = 0; idxB < countB; idxB++) { > if(reuse == MAT_INITIAL_MATRIX){ > col[jj] = garray[bjj[idxB]] + shift; > } > val[jj] = v2[idxB]; > jj++; > } > } else { > while(rstart + ajj[idxA] > garray[bjj[idxB]]){ > if(reuse == MAT_INITIAL_MATRIX){ > col[jj] = garray[bjj[idxB]] + shift; > } > val[jj] = v2[idxB]; > idxB++; > jj++; > } > > for (idxA = 0; idxA < countA; idxA++) { > if(reuse == MAT_INITIAL_MATRIX){ > col[jj] = rstart + ajj[idxA] + shift; > } > val[jj] = v1[idxA]; > jj++; > } > > for (; idxB < countB; idxB++) { > if(reuse == MAT_INITIAL_MATRIX){ > col[jj] = garray[bjj[idxB]] + shift; > } > val[jj] = v2[idxB]; > jj++; > } > } > } > if(reuse == MAT_INITIAL_MATRIX){ > row[m] = jj + shift; > } > PetscFunctionReturn(0); > } > > >
