Rohan Yadav <roh...@alumni.cmu.edu> writes:

> With modern GPU sizes, for example A100's with 80GB of memory, a vector of
> length 2^31 is not that much memory -- one could conceivably run a CG solve
> with local vectors > 2^31.

Yeah, each vector would be 8 GB (single precision) or 16 GB (double). You
can't store a matrix of this size, and probably not a "mesh", but it's
possible to create such a problem if everything is matrix-free (possibly
with matrix-free geometric multigrid). This is more likely to show up in a
benchmark than in any real science or engineering problem. We should
support it, but it still seems hypothetical and not urgent.

> Thanks Junchao, I might look into that. However, I currently am not trying
> to solve such a large problem -- these questions just came from wondering
> why the cuSPARSE kernel PETSc was calling was running faster than mine.

Hah, bandwidth doesn't lie. ;-)
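
For reference, a quick sketch of the arithmetic behind the 8 GB and 16 GB
figures above. It assumes 4-byte single-precision and 8-byte double-precision
entries and binary gigabytes (2^30 bytes). The names are illustrative only and
not part of PETSc. It also checks that a length of 2^31 does not fit in a
signed 32-bit index, which is presumably why local vectors this long would
need explicit support:

```python
# Back-of-the-envelope footprint of one vector with 2**31 entries.
# Assumptions (not from the thread): 4-byte floats, 8-byte doubles,
# and "GB" meaning 2**30 bytes.

N = 2**31             # vector length discussed in the thread
GIB = 2**30           # bytes per binary gigabyte
INT32_MAX = 2**31 - 1 # largest value of a signed 32-bit integer


def vector_gib(n_entries, bytes_per_entry):
    """Return the size in GiB of a dense vector with n_entries entries."""
    return n_entries * bytes_per_entry / GIB


single_gib = vector_gib(N, 4)  # single precision
double_gib = vector_gib(N, 8)  # double precision
fits_in_int32 = N <= INT32_MAX # can a signed 32-bit index hold the length?

print(f"single: {single_gib} GiB, double: {double_gib} GiB")
print(f"length fits in signed 32-bit index: {fits_in_int32}")
```

So the memory itself is modest next to an 80 GB device; the length is what
falls just outside the signed 32-bit range.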