Hi TongKe,

GPU memory usage can sometimes grow very large and easily hit the GPU memory limit, so memory has to be released more frequently to avoid this.
MXNet Java designed NDArray to be AutoCloseable, which lets you have its memory released as soon as you are done using it. Underneath, MXNet (the C API) has a reference-counting system, but it cannot track whether a JVM object is still holding a piece of memory. You have to close the JVM object yourself; this tells the Engine that the reference is no longer in use, so that it can go on to clean up that piece of memory. So the answer is yes, you will need to manage GPU NDArrays manually whenever you use them.

Thanks,
Qing

________________________________
From: TongKe Xue <tk...@tkxue.org>
Sent: Thursday, October 15, 2020 8:29
To: dev@mxnet.apache.org <dev@mxnet.apache.org>
Subject: How does mxnet efficiently free GPU memory ?

Hi,

In my very limited understanding:

* GPU memory is often a bottleneck to training DL
* Java, not being RAII / refcounted, does not have predictable destructors
* overloading math ops + auto diff often creates transient GPU tensors that should later be freed

Question: does mxnet have any automatic tracking of "this JVM object (1) is no longer reachable and (2) holds a GPU tensor, so we should free it" ?

Thanks,
--TongKe
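
As a rough illustration of the manual management described in the reply, the sketch below shows the general AutoCloseable pattern with Java's try-with-resources. It does not use the MXNet API: `NativeBuffer`, `addInScope`, and `liveCount` are hypothetical stand-ins invented for this example, and a plain float array plays the role of GPU memory. The point is only that the resource is released at a predictable place (the end of the try block) instead of whenever the JVM garbage collector happens to run.

```java
import java.util.Arrays;
import java.util.concurrent.atomic.AtomicInteger;

public class NativeBufferSketch {

    // Hypothetical stand-in for an NDArray-like wrapper; NOT the MXNet API.
    public static final class NativeBuffer implements AutoCloseable {
        // Counts buffers that have been created but not yet closed.
        static final AtomicInteger LIVE = new AtomicInteger();

        final float[] data;          // plays the role of device memory
        private boolean closed = false;

        NativeBuffer(int size, float fill) {
            data = new float[size];
            Arrays.fill(data, fill);
            LIVE.incrementAndGet();
        }

        // Element-wise add; creates a transient result that must also be closed.
        NativeBuffer add(NativeBuffer other) {
            NativeBuffer out = new NativeBuffer(data.length, 0f);
            for (int i = 0; i < data.length; i++) {
                out.data[i] = data[i] + other.data[i];
            }
            return out;
        }

        // A real wrapper would notify the native engine here; we only update the counter.
        @Override
        public void close() {
            if (!closed) {
                closed = true;
                LIVE.decrementAndGet();
            }
        }
    }

    // Number of buffers currently alive (created and not yet closed).
    public static int liveCount() {
        return NativeBuffer.LIVE.get();
    }

    // All three buffers are closed when the try block exits, in reverse order of declaration.
    public static float addInScope() {
        try (NativeBuffer a = new NativeBuffer(4, 1f);
             NativeBuffer b = new NativeBuffer(4, 2f);
             NativeBuffer c = a.add(b)) {
            return c.data[0];
        }
    }

    public static void main(String[] args) {
        float result = addInScope();
        System.out.println("result = " + result);
        System.out.println("live buffers after scope = " + liveCount());
    }
}
```

Note that the transient result `c` is declared in the try header as well; an intermediate value that is never bound to a variable would not be closed by this pattern and would have to be tracked some other way.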