Ingo Gaertner <[email protected]> writes:

> I have 2*ndim+1 entries per row (including the diagonal).
> In 2 dimensions, the suggested solution is 6 multiplications + 5 additions
> = 11 flops per row.
> The optimized solution is 4 multiplications + 3 additions = 7 flops per row.

Run it. Flops are not the performance-limiting factor for these operations. Your algorithm still needs to traverse the matrix, which for 2D is 5*sizeof(PetscScalar)+6*sizeof(PetscInt) = 64 bytes per dof. Applying the matrix while skipping the diagonal entry almost certainly does not cost less than applying the full matrix. A custom implementation also will not be able to use vector-friendly matrix formats without extra masking, which may indeed impact performance and, in any case, would require a custom implementation for that format. I think you're wasting your time, but please run it to see.

> I call this significant.
>
> Thanks
> Ingo
>
> 2017-04-14 16:50 GMT+02:00 Jed Brown <[email protected]>:
>
>> Ingo Gaertner <[email protected]> writes:
>>
>> > Does PETSc include an efficient implementation for the operation
>> > y=(A-diag(A))x or y_i=\sum_{j!=i}A_{ij}x_j on a sparse matrix A?
>> >
>> > In words, I need a matrix-vector product after the matrix diagonal has
>> > been set to zero. For efficiency reasons I can't copy and modify the
>> > matrix or first calculate the full product and then subtract the
>> > diagonal contribution.
>>
>> How many entries per row in your matrix? There isn't a special function
>> for this, but I'm skeptical that the performance gains of a custom
>> implementation would be significant. Do you have a profile showing that
>> it is?
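
The counts in this thread can be checked with a short sketch. It is plain Python, not PETSc code. It assumes a CSR-like layout, 8-byte PetscScalar and 4-byte PetscInt (a build with complex scalars or 64-bit indices would give different numbers), and all function names are made up for illustration. The 6 integers per row are read here as 5 column indices plus 1 row-pointer entry.

```python
# Assumed sizes: 8-byte PetscScalar, 4-byte PetscInt.
SCALAR_BYTES = 8
INT_BYTES = 4

def csr_bytes_per_row(ndim):
    """Matrix bytes streamed per row for a (2*ndim+1)-point stencil in CSR."""
    nnz = 2 * ndim + 1  # entries per row, including the diagonal
    # nnz values + nnz column indices + one row-pointer entry
    return nnz * SCALAR_BYTES + (nnz + 1) * INT_BYTES

def flops_full_then_subtract(ndim):
    """Full row product, then one multiply and one subtract for the diagonal."""
    nnz = 2 * ndim + 1
    return nnz + (nnz - 1) + 2

def flops_skip_diagonal(ndim):
    """Row product over the off-diagonal entries only."""
    off = 2 * ndim
    return off + (off - 1)

def offdiag_matvec(indptr, indices, data, x):
    """Reference y = (A - diag(A)) x on CSR arrays (hypothetical helper).

    Every stored entry, including the diagonal, is still visited, so the
    matrix traffic is the same as for a plain matrix-vector product.
    """
    y = [0.0] * (len(indptr) - 1)
    for i in range(len(y)):
        for k in range(indptr[i], indptr[i + 1]):
            j = indices[k]
            if j != i:  # the extra branch (or mask) per entry
                y[i] += data[k] * x[j]
    return y

if __name__ == "__main__":
    print(csr_bytes_per_row(2))         # 64 bytes per row in 2D
    print(flops_full_then_subtract(2))  # 11 flops per row
    print(flops_skip_diagonal(2))       # 7 flops per row
```

Under these assumptions both variants stream the same 64 bytes of matrix data per row in 2D, and only the flop count changes from 11 to 7. Whether that shows up in wall-clock time is exactly what a profile would have to settle.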
