Ni Hui created MXNET-491:
----------------------------
Summary: Use depthwise convolution by cuDNNv7 if available, updated version
Key: MXNET-491
URL: https://issues.apache.org/jira/browse/MXNET-491
Project: Apache MXNet
Issue Type: Improvement
Reporter: Ni Hui
Use group convolution by cuDNNv7 to improve GPU memory usage.
This pull request is based on #10804, with the following further changes:
- reduce indent changes
- prefer the cuDNN depthwise convolution over the MXNet implementation
- still use the explicit #if / #else / #endif statements rather than the new
  effective_num_group variable, for backward code path compatibility and
  because effective_num_group may be confused with standard group convolution
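For readers less familiar with the terminology: depthwise convolution is the special case of group convolution in which the number of groups equals the number of input channels, so each channel is filtered on its own. The sketch below illustrates that relationship in plain Python. It is not the MXNet or cuDNN code; the function name grouped_conv1d, the 1-D shape, the absence of padding and the stride of 1 are all simplifying assumptions made for illustration.

```python
def grouped_conv1d(x, w, num_group):
    """Illustrative grouped 1-D cross-correlation (no padding, stride 1).

    x: list of in_channels rows, each a list of samples
    w: list of out_channels filters; each filter holds
       (in_channels // num_group) kernels of equal length
    """
    in_ch, out_ch = len(x), len(w)
    assert in_ch % num_group == 0 and out_ch % num_group == 0
    in_per_group = in_ch // num_group
    out_per_group = out_ch // num_group
    k = len(w[0][0])
    out_len = len(x[0]) - k + 1

    y = []
    for oc in range(out_ch):
        g = oc // out_per_group  # group that this output channel belongs to
        row = []
        for t in range(out_len):
            acc = 0.0
            # Only the input channels of group g contribute to this output.
            for j in range(in_per_group):
                ic = g * in_per_group + j
                for d in range(k):
                    acc += x[ic][t + d] * w[oc][j][d]
            row.append(acc)
        y.append(row)
    return y


# Two input channels, one kernel per channel.
x = [[1.0, 2.0, 3.0, 4.0],
     [10.0, 20.0, 30.0, 40.0]]
w = [[[1.0, 1.0]],    # filter applied to channel 0 only
     [[1.0, -1.0]]]   # filter applied to channel 1 only

# num_group == number of input channels makes this a depthwise convolution.
depthwise = grouped_conv1d(x, w, num_group=len(x))
```

With num_group set to the channel count, each output row depends on exactly one input row, which is the case this issue routes to cuDNN v7 when it is available.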
Some feedback about the speed:
hardware: tesla-m40 24G x 2
system: centos-7
nvidia-387.26
cuda-9.1
cudnn-v7.1
model: mobilenet-v2
batch size: 256 (128 per GPU)
mxnet implementation: 68s/10iter
cudnnv7 implementation: 9.5s/10iter
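To put the two timings side by side, the arithmetic below converts them into a speedup factor and an approximate throughput. It assumes that "s/10iter" means wall-clock seconds for 10 iterations and that each iteration processes the full batch of 256 images; the variable names are chosen here for illustration only.

```python
batch_size = 256        # total across both GPUs, as reported
iterations = 10         # each timing covers 10 iterations (assumed reading)
mxnet_seconds = 68.0    # MXNet depthwise implementation
cudnn_seconds = 9.5     # cuDNN v7 implementation


def images_per_second(seconds):
    # Images processed during the timed window divided by its duration.
    return batch_size * iterations / seconds


mxnet_ips = images_per_second(mxnet_seconds)
cudnn_ips = images_per_second(cudnn_seconds)
speedup = mxnet_seconds / cudnn_seconds

print(f"mxnet:   {mxnet_ips:.1f} images/s")
print(f"cudnn:   {cudnn_ips:.1f} images/s")
print(f"speedup: {speedup:.1f}x")
```

Under those assumptions the cuDNN v7 path is roughly 7x faster on this setup.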