barry-jin opened a new issue #20315:
URL: https://github.com/apache/incubator-mxnet/issues/20315
It looks like GPU memory is not released after calling the `asnumpy()` method
on a large MXNet numpy ndarray with a GPU context.
Code to reproduce:
```python
import mxnet as mx
from mxnet import npx, gluon
from mxnet.gluon import nn

npx.set_np()
mx.context._current.set(mx.gpu(0))  # set the default context to GPU 0


def check_layer_forward_withinput(net, x):
    x_hybrid = x.copy()
    x.attach_grad()
    x_hybrid.attach_grad()
    net.initialize()
    with mx.autograd.record():
        out1 = net(x)
    out1.backward()
    net.hybridize()
    with mx.autograd.record():
        out2 = net(x_hybrid)
    out2.backward()
    free, total = mx.context.gpu_memory_info(0)
    print("Used memory {} GB, Total memory {} GB.".format(
        (total - free) / (1024 * 1024 * 1024), total / (1024 * 1024 * 1024)))
    # the two assert lines referred to below
    mx.test_utils.assert_almost_equal(x.grad, x_hybrid.grad, rtol=1e-5,
                                      atol=1e-6)
    mx.test_utils.assert_almost_equal(out1, out2, rtol=1e-5, atol=1e-6)


def test_slice_pooling2d():
    # transpose shape to bring feature dimension 'c' from 2nd position to last
    def transpose(shape):
        return (shape[0],) + shape[2:] + (shape[1],)

    for layout in ['NCHW', 'NHWC']:
        max_pooling = nn.MaxPool2D(layout=layout)
        avg_pooling = nn.AvgPool2D(layout=layout)
        global_maxpooling = nn.GlobalMaxPool2D(layout=layout)
        global_avgpooling = nn.GlobalAvgPool2D(layout=layout)
        pooling_layers = [max_pooling, avg_pooling,
                          global_maxpooling, global_avgpooling]

        class Net(gluon.HybridBlock):
            def __init__(self, slice, pooling_layer, **kwargs):
                super(Net, self).__init__(**kwargs)
                self.slice = slice
                self.pool0 = pooling_layer

            def forward(self, x):
                x_slice = mx.npx.slice(x, begin=self.slice[0],
                                       end=self.slice[1])
                out = self.pool0(x_slice)
                return out

        xshape = (16, 128, 256, 256)
        # xshape = (8, 64, 128, 128)
        slice_shape = (4, 16, 32, 64)
        if layout == 'NHWC':
            xshape = transpose(xshape)
            slice_shape = transpose(slice_shape)
        x = mx.np.random.uniform(size=xshape)
        slice = [(0, 0, 0, 0), slice_shape]
        for pooling_layer in pooling_layers:
            net = Net(slice, pooling_layer)
            check_layer_forward_withinput(net, x)


if __name__ == '__main__':
    test_slice_pooling2d()
```
Output before commenting out the two `mx.test_utils.assert_almost_equal()` lines:
```
Used memory 2.142578125 GB, Total memory 14.755615234375 GB.
Used memory 4.119140625 GB, Total memory 14.755615234375 GB.
Used memory 4.619140625 GB, Total memory 14.755615234375 GB.
Used memory 5.119140625 GB, Total memory 14.755615234375 GB.
Used memory 5.119140625 GB, Total memory 14.755615234375 GB.
Used memory 6.119140625 GB, Total memory 14.755615234375 GB.
Used memory 6.619140625 GB, Total memory 14.755615234375 GB.
Used memory 7.119140625 GB, Total memory 14.755615234375 GB.
```
Output after commenting out these two lines:
```
Used memory 2.142578125 GB, Total memory 14.755615234375 GB.
Used memory 2.142578125 GB, Total memory 14.755615234375 GB.
Used memory 2.1171875 GB, Total memory 14.755615234375 GB.
Used memory 2.1171875 GB, Total memory 14.755615234375 GB.
Used memory 2.1171875 GB, Total memory 14.755615234375 GB.
Used memory 2.1171875 GB, Total memory 14.755615234375 GB.
Used memory 2.1171875 GB, Total memory 14.755615234375 GB.
Used memory 2.6171875 GB, Total memory 14.755615234375 GB.
```
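Since `mx.test_utils.assert_almost_equal()` accepts ndarrays and compares them on the host, it presumably triggers the `asnumpy()` copy named in the title, which would fit the difference between the two logs above. Below is a minimal sketch, under my own assumptions (the array shape matches the large input in the repro; the iteration count is an arbitrary illustrative choice), that isolates just the `asnumpy()` path:
```python
import mxnet as mx
from mxnet import npx

npx.set_np()


def used_gb():
    free, total = mx.context.gpu_memory_info(0)
    return (total - free) / 1024**3


for i in range(5):
    # same size as the large input in the repro above
    arr = mx.np.random.uniform(size=(16, 128, 256, 256), ctx=mx.gpu(0))
    _ = arr.asnumpy()  # copy to host; GPU buffer should be reclaimable after this
    del arr
    mx.nd.waitall()    # flush the async engine before measuring
    print("Iteration {}: used {:.3f} GB".format(i, used_gb()))
```
Note that MXNet's pooled allocator caches freed device buffers, so the driver-level "used" figure from `gpu_memory_info` is not expected to drop back to zero; the symptom reported here is that it keeps growing across iterations.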
After changing `xshape` to a relatively smaller shape, `(8, 64, 128, 128)`, the
memory usage looks normal.
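For scale, a quick back-of-the-envelope check: the ~0.5 GB increments in the first log match the size of one input tensor at the original shape (float32 is the default dtype of `mx.np.random.uniform`):
```python
# one (16, 128, 256, 256) float32 tensor
elements = 16 * 128 * 256 * 256          # 134,217,728 elements
print(elements * 4 / 1024**3)            # 4 bytes each -> 0.5 GiB

# at the smaller shape the same tensor is only 32 MiB,
# so any per-iteration growth becomes much harder to notice
print(8 * 64 * 128 * 128 * 4 / 1024**2)  # 32.0 MiB
```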
_Originally posted by @barry-jin in
https://github.com/apache/incubator-mxnet/pull/20262#issuecomment-849217284_