KellenSunderland commented on issue #14684: When setting MXNET_CUDA_TENSOR_OP_MATH_ALLOW_CONVERSION no speedup observed
URL: https://github.com/apache/incubator-mxnet/issues/14684#issuecomment-482669415
 
 
   Auto-tuning is overwriting the math mode at convolution tuning time. That's probably the right thing to do when implementing Tensor Cores, but it prevents the conversion math type from being used. We'll have to think about the long-term fix for this; for now I've commented out the math type reset locally and I'm trying to verify that this cuDNN feature provides a significant speedup before moving forward. The new output from the cuDNN API logging looks a bit happier:
   
   ```
   I! CuDNN (v7301) function cudnnConvolutionForward() called:
   i!     handle: type=cudnnHandle_t; streamId=0x7f12280023a0;
   i!     alpha: type=CUDNN_DATA_FLOAT; val=1.000000;
   i!     xDesc: type=cudnnTensorDescriptor_t:
   i!         dataType: type=cudnnDataType_t; val=CUDNN_DATA_FLOAT (0);
   i!         nbDims: type=int; val=4;
   i!         dimA: type=int; val=[32,512,15,20];
   i!         strideA: type=int; val=[153600,300,20,1];
   i!     xData: location=dev; addr=0x7f1094000000;
   i!     wDesc: type=cudnnFilterDescriptor_t:
   i!         dataType: type=cudnnDataType_t; val=CUDNN_DATA_FLOAT (0);
   i!         vect: type=int; val=0;
   i!         nbDims: type=int; val=4;
   i!         dimA: type=int; val=[2048,512,1,1];
   i!         format: type=cudnnTensorFormat_t; val=CUDNN_TENSOR_NCHW (0);
   i!     wData: location=dev; addr=0x7f1141000000;
   i!     convDesc: type=cudnnConvolutionDescriptor_t:
   i!         mode: type=cudnnConvolutionMode_t; val=CUDNN_CROSS_CORRELATION (1);
   i!         dataType: type=cudnnDataType_t; val=CUDNN_DATA_FLOAT (0);
   i!         mathType: type=cudnnMathType_t; val=CUDNN_TENSOR_OP_MATH_ALLOW_CONVERSION (2);
   i!         arrayLength: type=int; val=2;
   i!         padA: type=int; val=[0,0];
   i!         strideA: type=int; val=[1,1];
   i!         dilationA: type=int; val=[1,1];
   i!         groupCount: type=int; val=1;
   i!     algo: type=cudnnConvolutionFwdAlgo_t; val=CUDNN_CONVOLUTION_FWD_ALGO_IMPLICIT_PRECOMP_GEMM (1);
   i!     workSpace: location=dev; addr=0x7f10bc000000;
   i!     workSpaceSizeInBytes: type=size_t; val=90572576;
   i!     beta: type=CUDNN_DATA_FLOAT; val=0.000000;
   i!     yDesc: type=cudnnTensorDescriptor_t:
   i!         dataType: type=cudnnDataType_t; val=CUDNN_DATA_FLOAT (0);
   i!         nbDims: type=int; val=4;
   i!         dimA: type=int; val=[32,2048,15,20];
   i!         strideA: type=int; val=[614400,300,20,1];
   i!     yData: location=dev; addr=0x7f1096000000;
   i! Time: 2019-04-12T11:02:48.313026 (0d+0h+0m+9s since start)
   i! Process=12161; Thread=12212; GPU=0; Handle=0x7f1228118830; StreamId=0x7f12280023a0.
   ```
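   For anyone trying to reproduce this, the setup can be sketched with environment variables: `MXNET_CUDA_TENSOR_OP_MATH_ALLOW_CONVERSION` is the MXNet flag from this issue, and `CUDNN_LOGINFO_DBG` / `CUDNN_LOGDEST_DBG` are cuDNN's own API-logging switches (the exact variable values below are a minimal sketch, not taken from this thread):

   ```shell
   # Opt in to FP32 -> FP16 conversion on Tensor Cores (the MXNet flag this issue is about).
   export MXNET_CUDA_TENSOR_OP_MATH_ALLOW_CONVERSION=1

   # Enable cuDNN's built-in API logging so each call (including the convolution
   # descriptor's mathType) is printed. CUDNN_LOGDEST_DBG may also be stderr or a filename.
   export CUDNN_LOGINFO_DBG=1
   export CUDNN_LOGDEST_DBG=stdout
   ```

   With logging enabled, grepping the output for `mathType` shows whether `CUDN N_TENSOR_OP_MATH_ALLOW_CONVERSION (2)` actually reaches the convolution descriptor, as in the log above.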
