[MXNET 2.0 Wishlist] [DISCUSS] Refine the InferStorageType and memory planning pass

Lv, Tao A Tue, 09 Apr 2019 01:46:54 -0700


Hi dev,

As we're discussing the roadmap for MXNet 2.0, I would like to start a thread
about refining the InferStorageType and memory planning pass in MXNet and hope
it can happen as a part of the 2.0 release.

Thanks to @eric-haibin-lin, part of the proposal has already been discussed in
issue #13598 [1].

As mentioned in the description of issue #13598, there are several drawbacks of
the existing flow. Please allow me to quote them here:
* the selection of MKL/CPU/GPU/CUDNN implementation happens after graph
attribute inference and memory planning, memory planning is thus not aware of
the implementation that will be used for execution in the future, which may
result in sub-optimal result. For example, the memory inplace option may vary
depending on the accelerator backend (the new version of CUDNN enables x/dx
inplace for _backward_conv).
* some sparse operators need to access dtype/shape information to decide
which implementation to invoke for execution, and whether to perform fallback.
This information is not yet exposed in the existing infer storage type
interface.
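
To make the second point concrete, here is a toy Python sketch (not MXNet
code) of a dispatch decision that changes once dtype and shape are visible.
The function name `dispatch`, the set `SPARSE_DTYPES` and the rules inside are
assumptions made up for illustration; only the storage type names "default",
"csr" and "row_sparse" come from MXNet.

```python
# Hypothetical set of dtypes that the sparse kernels are assumed to support.
SPARSE_DTYPES = {"float32", "float64"}


def dispatch(stypes, dtypes=None, shapes=None):
    """Pick an implementation label for one operator (illustrative only).

    stypes: storage type per input, e.g. ["csr", "default"].
    dtypes/shapes: optional per-input information. When they are missing,
    the decision can only look at storage types.
    """
    if "csr" not in stypes and "row_sparse" not in stypes:
        return "dense"
    if dtypes is None or shapes is None:
        # Storage types alone: assume the sparse kernel can run.
        return "sparse"
    if any(d not in SPARSE_DTYPES for d in dtypes):
        return "fallback"
    # CSR inputs are assumed to be 2-D in this sketch.
    if any(len(s) != 2 for s, st in zip(shapes, stypes) if st == "csr"):
        return "fallback"
    return "sparse"


# Same storage types, different answer once dtype is known.
print(dispatch(["csr", "default"]))
print(dispatch(["csr", "default"], ["float16", "float16"], [(4, 4), (4, 2)]))
```

With storage types alone the first call selects the sparse path; the second
call sees an unsupported dtype and can plan the fallback up front.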

Besides, the existing memory planning pass calculates and then allocates
memory strictly according to the input/output tensor shapes (which can be
derived from the operators' arithmetic formulas through InferShape). That no
longer holds for accelerators like MKL-DNN on CPU, which wants to pad
input/output tensors to optimal formats (e.g. nchw16c) according to the
hardware architecture. Such a format can also be described as shape + stride.
As many of you know, MKL-DNN shows great performance on these optimal formats,
which are blocked by the vector length of AVX512 or AVX2. It's very natural
for us to pad on the channel dimension for those inputs/outputs whose IC or OC
is not a multiple of the vector length, and to leverage the optimal kernels
for blocked formats. Unfortunately, this cannot be implemented without
changing the logic in the memory planning pass. Currently we always fall back
to slow reference kernels for both convolution [2] and deconvolution [3].
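
A small Python sketch of the size gap may help here. It assumes float32 data
(4 bytes per element) and a channel block of 16, as in nchw16c; the helper
names `round_up`, `padded_nchw_shape` and `num_bytes` are made up for this
example and are not MXNet or MKL-DNN APIs.

```python
from math import prod


def round_up(value, multiple):
    """Round value up to the next multiple of `multiple`."""
    return -(-value // multiple) * multiple


def padded_nchw_shape(shape, block):
    """Pad only the channel dimension of an (N, C, H, W) shape."""
    n, c, h, w = shape
    return (n, round_up(c, block), h, w)


def num_bytes(shape, itemsize=4):
    """Buffer size in bytes for a dense tensor of the given shape."""
    return prod(shape) * itemsize


shape = (1, 3, 224, 224)  # e.g. an RGB image batch, C = 3
arithmetic = num_bytes(shape)                    # 602112 bytes
blocked = num_bytes(padded_nchw_shape(shape, 16))  # 3211264 bytes
print(arithmetic, blocked)
```

A planner that only sees the arithmetic shape reserves 602112 bytes in this
case, while the blocked layout needs 3211264 bytes, so the buffer cannot hold
the padded tensor.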

AFAIK, the padding feature of MKL-DNN has already been used in TensorFlow and
other frameworks. We also found that, without supporting this feature, many
other new features from MKL-DNN cannot be applied to MXNet, such as the
deconvolution primitive, winograd, etc.

Changes for this proposal can be divided into the following parts:
1. Following the proposal in issue #13598, we need to add new
InferStorageTypeEx functions to operators which need to dispatch in a more
fine-grained way. This also requires that the InferStorageType pass can handle
the new -Ex function, as we did for FCompute and FComputeEx.
2. Attach more information to the computation graph/node, e.g. accelerator
specific information. Currently we add `IsMKLDNN` directly during operator
registration if MXNET_USE_MKLDNN == 1. It looks simple and crude to me.
3. Do memory planning according to more information: topology, shapes,
data types, in-place options and more accurate accelerator information
(accelerator path, memory size requirements, accelerator-wise attributes).
4. Improve MKL-DNN operators so they can work on well-planned memory,
which may be larger than the arithmetic requirements, and work with optimal
kernels. Also, with more accurate dispatching in InferStorageTypeEx, there is
no need for us to write complicated fallback logic in MKL-DNN operators.
5. If users feel uncomfortable with the higher memory usage, we can disable
this feature through environment variables.
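
As a rough illustration of parts 3 and 5 together, the Python sketch below
sizes each buffer from per-node backend information and lets an environment
variable switch the padding off. Everything in it is hypothetical: the
variable name EXAMPLE_PAD_FOR_ACCELERATOR, the "blocked" backend label, the
node dictionaries and the block size of 16 are assumptions for the example,
not existing MXNet or NNVM interfaces.

```python
import os
from math import prod

BLOCK = 16  # assumed channel block size of the blocked layout


def arithmetic_bytes(shape, itemsize):
    """Size implied by the plain tensor shape."""
    return prod(shape) * itemsize


def blocked_bytes(shape, itemsize, block=BLOCK):
    """Size of an (N, C, H, W) tensor with C padded to a multiple of block."""
    n, c, h, w = shape
    padded_c = -(-c // block) * block
    return n * padded_c * h * w * itemsize


def padding_enabled(environ=os.environ):
    """Hypothetical switch: any value other than "0" keeps padding on."""
    return environ.get("EXAMPLE_PAD_FOR_ACCELERATOR", "1") != "0"


def plan_bytes(nodes, environ=os.environ):
    """Return {name: bytes to reserve} for a list of node descriptions."""
    use_padding = padding_enabled(environ)
    plan = {}
    for node in nodes:
        shape, itemsize = node["shape"], node["itemsize"]
        if use_padding and node["backend"] == "blocked" and len(shape) == 4:
            plan[node["name"]] = blocked_bytes(shape, itemsize)
        else:
            plan[node["name"]] = arithmetic_bytes(shape, itemsize)
    return plan


nodes = [
    {"name": "conv0_out", "shape": (1, 3, 8, 8), "itemsize": 4, "backend": "blocked"},
    {"name": "fc_out", "shape": (1, 10), "itemsize": 4, "backend": "default"},
]
print(plan_bytes(nodes, {}))
print(plan_bytes(nodes, {"EXAMPLE_PAD_FOR_ACCELERATOR": "0"}))
```

With padding on, conv0_out is reserved at the padded size; with the switch
set to "0", the plan is the same as one computed from shapes alone.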

Since the memory planning pass is implemented in NNVM, we also need support
from the TVM community.

Please let me know what you think. Thank you.

-tao

[1] https://github.com/apache/incubator-mxnet/issues/13598

[2]
https://github.com/apache/incubator-mxnet/blob/master/src/operator/nn/mkldnn/mkldnn_convolution.cc#L194

[3]
https://github.com/apache/incubator-mxnet/blob/master/src/operator/nn/mkldnn/mkldnn_deconvolution.cc#L55
