There is already a library called cuTensor, but it is not this
"pre-release", which I can't find on the internet.

http://www.tensorlet.com/cutensor/

It might be useful for high order elements if they can support fusing
enough kernels, but is probably of limited utility if each contraction
implies a kernel launch.

"Smith, Barry F. via petsc-dev" <[email protected]> writes:

> Is this relevant for anything upcoming in PETSc?
>
> Barry
>
>
> From: Timothy Costa [email protected]
> Date: April 02, 2019
> Subject: Pre-Release: NVIDIA cuTENSOR library for accelerating tensor
> operations
>
> cuTENSOR is a new library containing highly optimized tensor
> primitives for NVIDIA GPUs. It provides a set of simple, flexible APIs
> for elementwise tensor operations and tensor contractions. cuTENSOR's
> expressive API allows for elementwise operation fusion and exposes
> several tensor contraction algorithms. cuTENSOR is now available in an
> apply-for-access pre-release at developer.nvidia.com/cutensor. Apply
> today!
