What percentage of the matrix entries are non-zero? If it is more than, say, 20-30 percent nonzero, then you should just use a dense matrix format. But then I see you are using a DMDA, which implies the matrix comes from some kind of mesh? Is there coupling beyond nearest neighbors in the matrix? Are there 4 by 4 blocks in the matrix? You could use BAIJ matrices with a block size of 4.
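If the 4 by 4 block structure is there, creating the matrix could look roughly like the sketch below. This is only an illustration: the preallocation counts are placeholders you would replace with the actual number of nonzero blocks per block row from your coupling.

#include <petscmat.h>

Mat            J;
PetscErrorCode ierr;
PetscInt       bs = 4;      /* block size */
PetscInt       n  = 288;    /* number of 4x4 block rows (and columns) */

ierr = MatCreate(PETSC_COMM_WORLD, &J); CHKERRQ(ierr);
ierr = MatSetSizes(J, PETSC_DECIDE, PETSC_DECIDE, bs*n, bs*n); CHKERRQ(ierr);
ierr = MatSetType(J, MATBAIJ); CHKERRQ(ierr);   /* resolves to SeqBAIJ or MPIBAIJ */
/* Preallocate by 4x4 blocks; the counts per block row below are placeholders only. */
ierr = MatSeqBAIJSetPreallocation(J, bs, 40, NULL); CHKERRQ(ierr);
ierr = MatMPIBAIJSetPreallocation(J, bs, 20, NULL, 20, NULL); CHKERRQ(ierr);
/* Values can then be inserted a block at a time with MatSetValuesBlocked(). */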
   Barry

On Aug 16, 2013, at 5:14 PM, "Jin, Shuangshuang" <[email protected]> wrote:

> Hello, Barry and Shri, thanks for your help.
>
> To answer your question, n is 288. Usually there are more than 1/8 nonzeros in the matrix, so it's pretty dense.
>
> We didn't preallocate the J[4*n][4*n] matrix; it was created in each IJacobian() call. I would like to create it in the main function and pass it to the IJacobian() function to reuse it for each time step. I guess in that way it can save much time and memory usage?
>
> I have this piece of code in the main function:
>
>   ierr = MatCreate(PETSC_COMM_WORLD, &J); CHKERRQ(ierr); // J: Jacobian matrix
>   ierr = MatSetSizes(J, PETSC_DECIDE, PETSC_DECIDE, 4*n, 4*n); CHKERRQ(ierr);
>   ierr = MatSetFromOptions(J); CHKERRQ(ierr);
>   ierr = MatSetUp(J); CHKERRQ(ierr);
>
>   ierr = TSSetIJacobian(ts, J, J, (TSIJacobian) IJacobian, &user); CHKERRQ(ierr);
>
> Shall I add the constant values for the J matrix here somewhere to use what you mentioned, "MatStoreValues() and MatRetrieveValues()"?
>
> Sorry, I cannot post the Jacobian matrix equations here because of a nondisclosure policy. But I can paste the framework of our IJacobian function with the equations skipped, for troubleshooting:
>
> PetscErrorCode Simulation::IJacobian(TS ts, PetscReal t, Vec X, Vec Xdot, PetscReal a, Mat *A, Mat *B, MatStructure *flag, Userctx *ctx)
> {
>   PetscLogStage stage = ctx->stage;
>   PetscLogStagePush(stage);
>
>   PetscErrorCode ierr;
>   PetscInt       n = ctx->n;
>   PetscScalar    *x;
>   PetscScalar    J[4*n][4*n];
>
>   PetscInt       rowcol[4*n];
>   PetscScalar    val[n];
>
>   PetscInt       i, j;
>
>   DM             da;
>   PetscInt       xstart, xlen;
>
>   int            me;
>   double         t0, t1;
>
>   PetscFunctionBeginUser;
>
>   ierr = TSGetDM(ts, &da); CHKERRQ(ierr);
>
>   // Get pointers to vector data
>   MPI_Comm_rank(PETSC_COMM_WORLD, &me);
>   scatterMyVec(X, &x); CHKERRQ(ierr);
>
>   // Get local grid boundaries
>   ierr = DMDAGetCorners(da, &xstart, NULL, NULL, &xlen, NULL, NULL); CHKERRQ(ierr);
>
>   ////////////////////////////////////////////////////////////////////////////////////////
>   // This proves to be the most time-consuming block in the computation:
>   // Assign values to J matrix for the first 2*n rows (constant values)
>   ... (skipped)
>
>   // Assign values to J matrix for the following 2*n rows (depends on X values)
>   for (i = 0; i < n; i++) {
>     for (j = 0; j < n; j++) {
>       ... (skipped)
>     }
>   }
>   ////////////////////////////////////////////////////////////////////////////////////////
>
>   for (i = 0; i < 4*n; i++) {
>     rowcol[i] = i;
>   }
>
>   // Compute function over the locally owned part of the grid
>   for (i = xstart; i < xstart+xlen; i++) {
>     ierr = MatSetValues(*B, 1, &i, 4*n, rowcol, &J[i][0], INSERT_VALUES); CHKERRQ(ierr);
>   }
>
>   ierr = DMDAVecRestoreArray(da, X, &x); CHKERRQ(ierr);
>
>   ierr = MatAssemblyBegin(*A, MAT_FINAL_ASSEMBLY); CHKERRQ(ierr);
>   ierr = MatAssemblyEnd(*A, MAT_FINAL_ASSEMBLY); CHKERRQ(ierr);
>   if (*A != *B) {
>     ierr = MatAssemblyBegin(*B, MAT_FINAL_ASSEMBLY); CHKERRQ(ierr);
>     ierr = MatAssemblyEnd(*B, MAT_FINAL_ASSEMBLY); CHKERRQ(ierr);
>   }
>   *flag = SAME_NONZERO_PATTERN;
>
>   PetscLogStagePop();
>   PetscFunctionReturn(0);
> }
>
> Hope this code can show you a better picture of our problem here.
>
> Thanks,
> Shuangshuang
>
>
> -----Original Message-----
> From: Barry Smith [mailto:[email protected]]
> Sent: Friday, August 16, 2013 2:08 PM
> To: Jin, Shuangshuang
> Cc: Jed Brown; [email protected]
> Subject: Re: [petsc-users] Performance of PETSc TS solver
>
>
> On Aug 16, 2013, at 4:01 PM, "Jin, Shuangshuang" <[email protected]> wrote:
>
>> Hello, Jed, in my IJacobian subroutine, I defined a PetscScalar J[4*n][4*n] and filled in the values for this J matrix by MatSetValues().
>
> What is n?
>
> It should not be taking anywhere near this much time. How sparse is the matrix? Do you preallocate the nonzero structure? Do you reuse the same matrix for each time step?
>>
>> 245 seconds out of the total 351 seconds in the DAE TS solving part are due to this J matrix computation.
>>
>> For that J matrix, half of the entries are constant values which don't change in each iteration. However, since my J is created inside each IJacobian() call, I couldn't reuse it. If that part of the work is redundant computation, I would like to know if there's a way to set up the Jacobian matrix outside of the IJacobian() subroutine, so that I can keep the constant part of the values in J across all the iterations and only update the changing values, which depend on X?
>
> MatStoreValues() and MatRetrieveValues(), but you can only call these after you have assembled the matrix with the correct nonzero structure. So you need to put the constant values in, put zeros in all the locations with non-constant values (that are not permanent zeros), and call MatAssemblyBegin/End(); then call MatStoreValues(). Then, for each computation of the Jacobian, you first call MatRetrieveValues() and then put in the non-constant values. Then call MatAssemblyBegin/End().
>
> Barry
>
>>
>> Thanks,
>> Shuangshuang
>>
>> -----Original Message-----
>> From: Jed Brown [mailto:[email protected]] On Behalf Of Jed Brown
>> Sent: Thursday, August 15, 2013 7:27 PM
>> To: Jin, Shuangshuang
>> Cc: [email protected]
>> Subject: RE: [petsc-users] Performance of PETSc TS solver
>>
>> "Jin, Shuangshuang" <[email protected]> writes:
>>
>>> Hi, Jed,
>>>
>>> I followed your suggestion and profiled the IJacobian stage, please see the related profile below:
>>
>> Cool, all of these are pretty inexpensive, so your time is probably in compu

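To make the MatStoreValues()/MatRetrieveValues() pattern Barry describes above concrete, a rough sketch follows. It assumes AIJ preallocation with placeholder row counts, and the helpers SetConstantEntries(), SetStateDependentEntriesToZero(), and SetStateDependentEntries() are hypothetical stand-ins for the actual Jacobian assembly code.

/* One-time setup, e.g. in main(), once the problem size n is known. */
Mat            J;
PetscErrorCode ierr;
PetscInt       n = 288;
PetscInt       nnz_per_row = 2*n;   /* placeholder; use your real per-row counts */

ierr = MatCreate(PETSC_COMM_WORLD, &J); CHKERRQ(ierr);
ierr = MatSetSizes(J, PETSC_DECIDE, PETSC_DECIDE, 4*n, 4*n); CHKERRQ(ierr);
ierr = MatSetFromOptions(J); CHKERRQ(ierr);
ierr = MatSeqAIJSetPreallocation(J, nnz_per_row, NULL); CHKERRQ(ierr);
ierr = MatMPIAIJSetPreallocation(J, nnz_per_row, NULL, nnz_per_row, NULL); CHKERRQ(ierr);

/* Insert the full nonzero pattern once: constant entries with their values,
   and explicit zeros at every location that will later hold X-dependent values. */
ierr = SetConstantEntries(J); CHKERRQ(ierr);              /* hypothetical helper */
ierr = SetStateDependentEntriesToZero(J); CHKERRQ(ierr);  /* hypothetical helper */
ierr = MatAssemblyBegin(J, MAT_FINAL_ASSEMBLY); CHKERRQ(ierr);
ierr = MatAssemblyEnd(J, MAT_FINAL_ASSEMBLY); CHKERRQ(ierr);
ierr = MatSetOption(J, MAT_NEW_NONZERO_LOCATIONS, PETSC_FALSE); CHKERRQ(ierr);
ierr = MatStoreValues(J); CHKERRQ(ierr);                  /* stash the constant part */

ierr = TSSetIJacobian(ts, J, J, (TSIJacobian) IJacobian, &user); CHKERRQ(ierr);

/* Then, inside IJacobian(), for every Jacobian evaluation: */
ierr = MatRetrieveValues(*B); CHKERRQ(ierr);              /* restore the stored constants */
ierr = SetStateDependentEntries(*B, X, a); CHKERRQ(ierr); /* hypothetical helper: only the X-dependent entries */
ierr = MatAssemblyBegin(*B, MAT_FINAL_ASSEMBLY); CHKERRQ(ierr);
ierr = MatAssemblyEnd(*B, MAT_FINAL_ASSEMBLY); CHKERRQ(ierr);
*flag = SAME_NONZERO_PATTERN;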