Hi, in my very limited understanding:
* GPU memory is often a bottleneck when training DL models
* Java, not being RAII / refcounted, does not have predictable destructors
* overloading math ops + auto diff often creates transient GPU tensors that should later be freed

Question: does mxnet have any automatic tracking of "this JVM object (1) is no longer reachable and (2) holds a GPU tensor, so we should free it"?

Thanks,
--TongKe
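
P.S. To make the question concrete, here is a minimal sketch of the kind of tracking I mean, using `java.lang.ref.Cleaner` (Java 9+). Everything in it is hypothetical and is not mxnet API: `TrackedTensor` is a made-up wrapper, and `LIVE_BYTES` is a counter standing in for the GPU allocator, so no real GPU memory is involved.

```java
import java.lang.ref.Cleaner;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical wrapper, NOT an mxnet class: it only illustrates the pattern.
final class TrackedTensor implements AutoCloseable {
    // One shared Cleaner; it runs registered actions on its own daemon thread.
    private static final Cleaner CLEANER = Cleaner.create();

    // Stand-in for the GPU allocator: bytes currently "allocated".
    static final AtomicLong LIVE_BYTES = new AtomicLong();

    // The cleanup action must not reference the TrackedTensor itself,
    // otherwise the wrapper could never become unreachable.
    private static final class Release implements Runnable {
        private final long bytes;

        Release(long bytes) {
            this.bytes = bytes;
        }

        @Override
        public void run() {
            // A real binding would call its native free function here.
            LIVE_BYTES.addAndGet(-bytes);
        }
    }

    private final Cleaner.Cleanable cleanable;

    TrackedTensor(long bytes) {
        LIVE_BYTES.addAndGet(bytes);
        // Fallback path: the action runs some time after the GC finds
        // this wrapper unreachable.
        this.cleanable = CLEANER.register(this, new Release(bytes));
    }

    // Deterministic path: frees immediately. clean() runs the action at
    // most once, so a later GC-triggered cleanup will not free twice.
    @Override
    public void close() {
        cleanable.clean();
    }
}

class TensorDemo {
    public static void main(String[] args) {
        try (TrackedTensor t = new TrackedTensor(1024)) {
            System.out.println(TrackedTensor.LIVE_BYTES.get()); // prints 1024
        }
        System.out.println(TrackedTensor.LIVE_BYTES.get()); // prints 0
    }
}
```

As far as I understand, the GC-driven path alone would not be enough: the collector reacts to JVM heap pressure, not GPU memory pressure, so small wrapper objects holding large GPU buffers could stay alive long after they become unreachable. That is why the sketch also has an explicit `close()`.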