Matthew Knepley <[email protected]> writes:

>> These users compute redundantly and set MAT_NO_OFF_PROC_ENTRIES.
>
> Fine, we should have a flag like that.

We do, it's called MAT_NO_OFF_PROC_ENTRIES.

>> What if you do the sort/reduce thing within thread blocks, and only
>> write the reduced version to global storage?
>
> I think it should be easy, but we will have to see what is out there for
> thread blocks.

Cool. Maybe we'll be able to work up a proof of concept this week, will
see how it goes tomorrow.
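
To make the sort/reduce idea quoted above concrete, here is a rough serial
sketch in plain C, not the proof of concept itself. It assumes each thread
block collects its contributions as (row, col, value) triples in a local
buffer, sorts them so duplicates are adjacent, sums the duplicates, and only
then writes the unique entries out. The names Entry and sort_reduce are
hypothetical and are not part of PETSc; a real GPU version would do the sort
and the segmented sum cooperatively in shared memory rather than with qsort.

```c
#include <stdlib.h>

/* One matrix contribution in coordinate (COO) form. Hypothetical type. */
typedef struct {
  int    row, col;
  double val;
} Entry;

/* Order entries by (row, col) so that duplicates become adjacent. */
static int entry_cmp(const void *a, const void *b)
{
  const Entry *x = (const Entry *)a, *y = (const Entry *)b;
  if (x->row != y->row) return (x->row < y->row) ? -1 : 1;
  if (x->col != y->col) return (x->col < y->col) ? -1 : 1;
  return 0;
}

/* Sort the local buffer, then sum runs of equal (row, col) in place.
   Returns the number of unique entries left at the front of buf;
   only those would be written to global storage. */
static size_t sort_reduce(Entry *buf, size_t n)
{
  size_t out = 0, i;
  if (n == 0) return 0;
  qsort(buf, n, sizeof(Entry), entry_cmp);
  for (i = 1; i < n; i++) {
    if (buf[i].row == buf[out].row && buf[i].col == buf[out].col) {
      buf[out].val += buf[i].val;   /* duplicate: accumulate */
    } else {
      buf[++out] = buf[i];          /* new (row, col): start a new run */
    }
  }
  return out + 1;
}
```

The point of reducing first is that the number of writes to global storage
drops from the number of contributions to the number of distinct (row, col)
pairs touched by the block.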
