Ni Hui created MXNET-491:
----------------------------

             Summary: Use depthwise convolution by cuDNNv7 if available, 
updated version
                 Key: MXNET-491
                 URL: https://issues.apache.org/jira/browse/MXNET-491
             Project: Apache MXNet
          Issue Type: Improvement
            Reporter: Ni Hui


Use cuDNN v7 group convolution to improve GPU memory usage.
This pull request is based on #10804,
with the following further changes:

    reduce indent changes
    prefer the cuDNN depthwise convolution over the MXNet implementation

This still uses the explicit #if #else #endif statements rather than
the new effective_num_group variable, for backward code path compatibility,
because effective_num_group may confuse readers by suggesting standard
group convolution.
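
For readers unfamiliar with the distinction, here is a minimal Python sketch of how depthwise convolution relates to standard group convolution. It is not MXNet or cuDNN code: the helper names conv_weight_shape and is_depthwise are hypothetical, and "depthwise" is assumed here to mean that the number of groups equals the number of input channels.

```python
def conv_weight_shape(in_channels, out_channels, kernel, num_group=1):
    """Weight shape of a 2D group convolution: (out, in / groups, kh, kw).

    Hypothetical helper for illustration only.
    """
    if in_channels % num_group or out_channels % num_group:
        raise ValueError("channel counts must be divisible by num_group")
    kh, kw = kernel
    return (out_channels, in_channels // num_group, kh, kw)


def is_depthwise(in_channels, num_group):
    """Assumption: depthwise means one group per input channel."""
    return num_group > 1 and num_group == in_channels


def num_weights(shape):
    """Total number of elements in a weight tensor of the given shape."""
    total = 1
    for dim in shape:
        total *= dim
    return total


# A 3x3 layer with 32 input and 32 output channels.
standard = conv_weight_shape(32, 32, (3, 3), num_group=1)
grouped = conv_weight_shape(32, 32, (3, 3), num_group=4)
depthwise = conv_weight_shape(32, 32, (3, 3), num_group=32)

for name, shape, groups in [
    ("standard", standard, 1),
    ("grouped", grouped, 4),
    ("depthwise", depthwise, 32),
]:
    print(name, shape, num_weights(shape), is_depthwise(32, groups))
```

In the depthwise case each filter sees exactly one input channel, which makes it a special case of group convolution rather than a separate operation.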

    some feedback about the speed

    hardware: tesla-m40 24G x 2
    system: centos-7
    nvidia-387.26
    cuda-9.1
    cudnn-v7.1

    model: mobilenet-v2
    batchsize 256 (128 per gpu)

    mxnet implementation: 68s/10iter
    cudnnv7 implementation: 9.5s/10iter
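
As a rough reading of the timings above, the following Python sketch derives the speedup and the approximate throughput. It assumes that each iteration processes the full batch of 256 images across both GPUs; the variable and function names are illustrative only.

```python
BATCH_SIZE = 256   # total per iteration (128 per GPU), as reported above
ITERATIONS = 10    # both timings are per 10 iterations

# Reported wall-clock seconds per 10 iterations.
timings_s = {"mxnet": 68.0, "cudnnv7": 9.5}


def images_per_second(seconds, batch_size=BATCH_SIZE, iterations=ITERATIONS):
    """Approximate training throughput for one timing measurement."""
    return batch_size * iterations / seconds


speedup = timings_s["mxnet"] / timings_s["cudnnv7"]
print(f"speedup: {speedup:.2f}x")
for name, seconds in timings_s.items():
    print(f"{name}: {images_per_second(seconds):.1f} images/s")
```

Under that assumption the cuDNN v7 path is roughly 7 times faster than the MXNet implementation on this setup.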





--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
