I have an explanation, but I'll have to think about the best fix. The problem
starts with the fact that cudnnFind() does its own workspace allocations and
doesn't use MXNet's memory allocator. MXNet anticipates this by setting up
'headroom' via MXNET_GPU_MEM_POOL_RESERVE (a percentage of total GPU memory). I
was able to run your script with repeated allocations on a 16GB GPU by setting
MXNET_GPU_MEM_POOL_RESERVE=35, which reserves about 5.6GB. On a 12GB GPU, the
corresponding value would be 47! That's clearly excessive, so we might have to
resort to calling the 'Ex' flavor of cudnnFind, which allows for pre-screening
of algos whose workspace exceeds the threshold set by the convolution
instance's 'workspace' param.
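
For reference, here is a rough sketch of where the 47 comes from. It assumes
cudnnFind() needs the same absolute headroom on both GPUs, which is my
assumption and not something I measured on a 12GB card. The helper name
equivalent_reserve_percent is made up for illustration. I'm also assuming the
variable has to be set before MXNet initializes its GPU memory pool, so
the sketch sets it before any 'import mxnet'.

```python
import math
import os


def equivalent_reserve_percent(ref_percent, ref_total_gb, target_total_gb):
    """Hypothetical helper: reserve percentage on the target GPU that yields
    the same absolute headroom as ref_percent does on the reference GPU."""
    headroom_gb = ref_total_gb * ref_percent / 100.0
    # Round up so the target GPU reserves at least as much as the reference.
    return math.ceil(100.0 * headroom_gb / target_total_gb)


# 35% of a 16GB GPU, expressed as a percentage of a 12GB GPU.
reserve_12gb = equivalent_reserve_percent(35, 16, 12)
print(reserve_12gb)

# Assumption: set this before MXNet is imported so the pool picks it up.
os.environ["MXNET_GPU_MEM_POOL_RESERVE"] = str(reserve_12gb)
```

The same variable can of course be set in the shell when launching the script.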

[ Full content available at: 
https://github.com/apache/incubator-mxnet/issues/12662 ]
This message was relayed via gitbox.apache.org for [email protected]
