Hi Qing,

I think I am misunderstanding something very basic. Perhaps we can work through this example:

At https://mxnet.apache.org/versions/1.7/api/java/docs/api/#org.apache.mxnet.javaapi.NDArray
we see a function of signature:

  def add(other: NDArray): NDArray

Suppose we have (x: NDArray), (y: NDArray), (z: NDArray), all of the right
dimensions and GPU backed.

Furthermore, suppose we do:

  out = x * 2.0 + (y * 3.0) + z

My intuition is that this generates temporary values t1, t2, t3 where:

  t1 = x * 2.0
  t2 = y * 3.0
  t3 = t1 + t2
  out = t3 + z

However, I am not manually calling dispose on any of t1, t2, t3. Is this
resulting in a memory leak?

--TongKe

On Thu, Oct 15, 2020 at 9:25 AM Qing Lan <lanking...@live.com> wrote:

> Hi Tongke,
>
> GPU memory sometimes go very large and easily crash the GPU memory limit.
> So it require more frequent GC to solve the issue.
>
> MXNet Java designed NDArray to be autoclosable which allow you to get
> memory GC'ed once the usage is done.
>
> MXNet (C API) have a reference counting system established below, but it
> cannot track the JVM object if it holds a piece of memory space. You will
> have to close the JVM object itself which call the Engine that the
> reference is not used to further clean this piece of memory. So the answer
> will be yes, you will need to manually managing the GPU NDArrays if it
> being used.
>
> Thanks,
> Qing
>
> ________________________________
> From: TongKe Xue <tk...@tkxue.org>
> Sent: Thursday, October 15, 2020 8:29
> To: dev@mxnet.apache.org <dev@mxnet.apache.org>
> Subject: How does mxnet efficiently free GPU memory ?
>
> Hi,
>
> In my very limited understanding:
>
> * GPU memory is often a bottleneck to training DL
> * Java, not being RAII / refcounted, does not have predictable
> destructors
> * overloading math ops + auto diff often creates transient GPU tensors
> that should later be freed
>
> Question: does mxnet have any automatic tracking of "this JVM object (1)
> is no longer reachable and (2) holds a GPU tensor, so we should free it" ?
>
> Thanks,
> --TongKe
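
P.S. To make the question about out = x * 2.0 + (y * 3.0) + z concrete, here is a minimal sketch of the two patterns I have in mind. It does not use MXNet at all: Buf is a hypothetical stand-in class that only counts how many "buffers" are still open, and its mul, add and close methods are my own assumptions, not the NDArray API. The one thing taken from this thread is that NDArray is auto-closable, so try-with-resources should apply to it in the same way.

```java
import java.util.concurrent.atomic.AtomicInteger;

public class TemporariesSketch {

    // Hypothetical stand-in for a GPU-backed array. NOT the MXNet NDArray class.
    static final class Buf implements AutoCloseable {
        // Number of Buf objects created but not yet closed.
        static final AtomicInteger live = new AtomicInteger();

        final double value;
        private boolean closed = false;

        Buf(double value) {
            this.value = value;
            live.incrementAndGet();
        }

        // Each operation allocates a new Buf, just like t1, t2, t3 in the example.
        Buf mul(double scalar) { return new Buf(value * scalar); }
        Buf add(Buf other) { return new Buf(value + other.value); }

        @Override
        public void close() {
            if (!closed) {
                closed = true;
                live.decrementAndGet();
            }
        }
    }

    // Chained expression: the intermediate results are never closed.
    static int chainedLiveCount() {
        Buf.live.set(0);
        try (Buf x = new Buf(1.0); Buf y = new Buf(2.0); Buf z = new Buf(3.0)) {
            Buf out = x.mul(2.0).add(y.mul(3.0)).add(z);
            out.close(); // only the final result is released explicitly
        }
        return Buf.live.get(); // t1, t2, t3 are still open
    }

    // Same computation with every temporary named and scoped.
    static int scopedLiveCount() {
        Buf.live.set(0);
        try (Buf x = new Buf(1.0); Buf y = new Buf(2.0); Buf z = new Buf(3.0)) {
            try (Buf t1 = x.mul(2.0);
                 Buf t2 = y.mul(3.0);
                 Buf t3 = t1.add(t2);
                 Buf out = t3.add(z)) {
                double result = out.value; // use the result while it is open
            }
        }
        return Buf.live.get(); // everything was closed in reverse order
    }

    public static void main(String[] args) {
        System.out.println("chained: " + chainedLiveCount() + " buffers left open");
        System.out.println("scoped:  " + scopedLiveCount() + " buffers left open");
    }
}
```

In this toy model the chained form leaves three buffers open and the scoped form leaves none. Whether the real NDArray behaves like the chained case is exactly what I am asking.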