Karl Rupp <[email protected]> writes:

> I'm not talking about CSR vs. COO from the SpMV point of view, but
> rather on how to store the actual data in global memory without
> expensive subsequent sorts.

Sure, but this seems like such a minor detail. With PetscScalar=double and PetscInt=int, we have 16 bytes/entry for COO and (nominally) 12 bytes/entry for CSR, and the data only needs to go to GPU global memory and back, not across to the CPU. I doubt the difference between 12 and 16 bytes/entry during assembly is a bottleneck.
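
For reference, the per-entry figures above can be reproduced with a rough sketch like the following. It assumes 8-byte scalars and 4-byte indices as stated; the matrix size and the 7 nonzeros per row are hypothetical values chosen only for illustration, and the helper names are not PETSc API.

```python
# Assumed sizes: PetscScalar = double (8 bytes), PetscInt = int (4 bytes).
SCALAR_BYTES = 8
INT_BYTES = 4


def coo_bytes(nnz):
    """COO stores a (row, col, value) triple for every nonzero."""
    return nnz * (2 * INT_BYTES + SCALAR_BYTES)


def csr_bytes(nrows, nnz):
    """CSR stores (col, value) per nonzero plus nrows + 1 row offsets."""
    return nnz * (INT_BYTES + SCALAR_BYTES) + (nrows + 1) * INT_BYTES


# Hypothetical matrix: 1e6 rows with 7 nonzeros per row.
nrows, nnz_per_row = 1_000_000, 7
nnz = nrows * nnz_per_row

coo_per_entry = coo_bytes(nnz) / nnz
csr_per_entry = csr_bytes(nrows, nnz) / nnz

print(f"COO: {coo_per_entry:.2f} bytes/entry")
print(f"CSR: {csr_per_entry:.2f} bytes/entry")
```

The CSR figure is "nominal" because the row offsets add 4 bytes per row on top of the 12 bytes per nonzero, so the true per-entry cost depends on how many nonzeros each row holds.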
