If the row lengths are not uniform, the threadcomm implementations of MatMult and similar kernels should be able to partition by some other metric, most obviously the number of nonzeros. Since we currently keep thread offsets in PetscLayout, that is the obvious place to put the weights. To start the discussion, consider this interface, where there is one integer-valued weight per block (with the block size set via PetscLayoutSetBlockSize):

  PetscErrorCode PetscLayoutSetWeights(PetscLayout,const PetscInt[],PetscCopyMode);

Reasonable? Should it be possible to implement PetscLayoutGetWeights? If so, we have to make a copy of the integer weights. If not, we can promise to discard our reference in PetscLayoutSetUp, and then the caller (e.g., MatSeqAIJSetPreallocation_SeqAIJ) can just pass its const PetscInt[] along rather than needing to make a copy.
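
To make the intent concrete, here is a rough sketch of how per-block weights could be turned into thread offsets. It is only an illustration under my own assumptions: PartitionByWeight is a hypothetical standalone helper, not an existing PETSc function, it uses plain int instead of PetscInt, and it does a simple greedy prefix-sum split rather than anything the threadcomm code does today.

```c
#include <assert.h>

/* Hypothetical helper (not part of PETSc): split nblocks weighted blocks into
   nthreads contiguous ranges of roughly equal total weight.
   offsets must have room for nthreads+1 entries; thread t owns blocks
   [offsets[t], offsets[t+1]). */
static void PartitionByWeight(int nblocks, const int weights[], int nthreads, int offsets[])
{
  long total = 0, running = 0;
  int  i, t = 1;

  assert(nthreads > 0);
  for (i = 0; i < nblocks; i++) total += weights[i];
  offsets[0] = 0;
  for (i = 0; i < nblocks && t < nthreads; i++) {
    running += weights[i];
    /* End the range of thread t-1 after block i once the running weight
       reaches its cumulative target t*total/nthreads */
    while (t < nthreads && running * nthreads >= total * t) offsets[t++] = i + 1;
  }
  /* Close any ranges still open; the last offset is always nblocks */
  while (t <= nthreads) offsets[t++] = nblocks;
}
```

For example, with weights {1,1,1,1,4} and two threads, this gives offsets {0,4,5}, so each thread gets a total weight of 4, whereas splitting by row count alone would leave one thread with most of the work.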
