Yes, an Nvidia Titan X is quite a common GPU for deep learning and comes with 12 GB of VRAM; the GTX 1080 Ti has 11 GB, and the GTX 1070/1080 have 8 GB. Whatever the GPU, the goal is to saturate it with the biggest batch it can handle.
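As a rough illustration of how VRAM bounds the batch size, here is a back-of-the-envelope sketch. The byte counts (fixed overhead for weights/optimizer/context, activation memory per sample) are hypothetical numbers I made up for the example, not measurements of any real network:

```python
def max_batch_size(vram_bytes, fixed_bytes, per_sample_bytes):
    """Largest batch that fits, given total VRAM, fixed overhead
    (weights, optimizer state, CUDA context) and per-sample
    activation memory. Purely illustrative arithmetic."""
    return max((vram_bytes - fixed_bytes) // per_sample_bytes, 0)

GB = 1024 ** 3
MB = 1024 ** 2

# Hypothetical: Titan X with 12 GB VRAM, ~2 GB fixed overhead,
# ~40 MB of activations per sample
print(max_batch_size(12 * GB, 2 * GB, 40 * MB))  # → 256
```

In practice people usually just try increasing batch sizes until the framework throws an out-of-memory error, since per-sample memory is hard to predict analytically.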
There is a lot of research going into more memory-efficient networks and into optimizing memory usage without impacting accuracy. I had not come across this Rust allocators collection before, great find. All my CUDA allocator research is detailed [there](https://github.com/mratsim/Arraymancer/issues/112).
