Re: [MXNET 2.0 Wishlist] [DISCUSS] Refine the InferStorageType and memory planning pass

2019-04-10 Thread Sheng Zha
Relay is NNVM v2. The main difference between NNVM and Relay is that the latter 
can represent control flow. Translating the optimization pass suggested in this 
thread from NNVM to Relay should be straightforward. Given that, I'd also 
suggest starting early with NNVM.

-sz

> On Apr 10, 2019, at 8:26 AM, Lv, Tao A  wrote:
> 
> 
> @Tianqi,
> 
> Thank you for the information. I will take a look at that to see if we can 
> take advantage of it.
> 
> @Junru,
> 
> The reason why we want to hold this change until 2.0 is that there is a 
> discussion in the TVM community that NNVM will be deprecated soon, and then I 
> think MXNet will have to move to a new IR, either NNVM v2 or Relay. As most 
> changes in this proposal are related to IR passes, we definitely don't want to 
> spend much effort on something that is being deprecated. 2.0 seems to be a more 
> appropriate time for us to make these changes. But I agree with you, we can 
> start to do some experiments on the existing architecture and the NNVM IR.
> 
> -tao
> 
> -Original Message-
> From: Junru Shao [mailto:junrushao1...@gmail.com] 
> Sent: Wednesday, April 10, 2019 1:34 PM
> To: dev@mxnet.incubator.apache.org
> Subject: Re: [MXNET 2.0 Wishlist] [DISCUSS] Refine the InferStorageType and 
> memory planning pass
> 
> Agreed with Tianqi that we could have a better implementation once we have 
> better TVM NNVM v2 integration. For now, I believe that we shouldn't block the 
> development of the Intel folks.
> 
> On Tue, Apr 9, 2019 at 10:10 PM Tianqi Chen 
> wrote:
> 
>> This kind of conversion can be viewed as an enhanced version of the 
>> AlterOpLayout pass in TVM Relay.
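
For reference, a minimal sketch of invoking such a layout-rewriting pass in
Relay, assuming a recent TVM; the pass name, target string, and helper function
here are illustrative rather than anything agreed in this thread:

import tvm
from tvm import relay

def rewrite_layouts(mod):
    # AlterOpLayout lets each operator substitute its preferred layout,
    # e.g. a blocked NCHWc layout for conv2d on x86 targets.
    seq = tvm.transform.Sequential([relay.transform.AlterOpLayout()])
    with tvm.target.Target("llvm -mcpu=skylake-avx512"):
        with tvm.transform.PassContext(opt_level=3):
            return seq(mod)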
>> 
>>> On Tue, Apr 9, 2019 at 8:03 PM Lv, Tao A  wrote:
>>> 
>>> 
>>> Thank you Tianqi and Sam for the kind suggestions.
>>> 
>>> @Tianqi,
>>> 
>>> Can you please point me to the code of this pass, or do you think 
>>> anyone from the TVM community can help educate me on this? I'm very 
>>> happy to learn from that.
>>> 
>>> Just one note: we are not only doing layout transformation but also 
>>> need extra memory for the layout transformation. For example, 
>>> (N=32, C=3, H=256, W=256) will be padded to (N=32, C=16, H=256, W=256) 
>>> along the channel dimension and then converted from (N=32, C=16, H=256, 
>>> W=256) to nchw16c so we can leverage the corresponding optimal 
>>> computation kernels. That's why we also need changes to the memory 
>>> planning pass.
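
A minimal NumPy sketch of the padding-plus-reorder described above; this is
purely illustrative (the real reorder happens inside MKL-DNN), using the shapes
from the example:

import numpy as np

def to_nchw16c(x, block=16):
    # pad the channel dim up to a multiple of `block`, then reorder
    # NCHW -> NCHW16c, i.e. (n, c_outer, h, w, c_inner)
    n, c, h, w = x.shape
    c_pad = ((c + block - 1) // block) * block
    padded = np.zeros((n, c_pad, h, w), dtype=x.dtype)   # the extra memory
    padded[:, :c, :, :] = x
    return padded.reshape(n, c_pad // block, block, h, w).transpose(0, 1, 3, 4, 2)

x = np.random.rand(32, 3, 256, 256).astype(np.float32)
y = to_nchw16c(x)   # shape (32, 1, 256, 256, 16)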
>>> 
>>> 
>>> @Sam,
>>> 
>>> Yes, definitely we're treating MKL-DNN as an accelerator on CPU.
>>> Previously we used it to accelerate certain critical operators in MXNet 
>>> in certain situations, e.g. FP32 convolution/deconvolution/fullyConnected, 
>>> etc. But as both MXNet and MKL-DNN have evolved, we have started to do 
>>> more that is not supported by the original MXNet CPU implementation, 
>>> such as quantization and graph fusion. So the MKL-DNN backend is also 
>>> changing from a simple `accelerator` to a `default` backend on CPU. And I 
>>> totally agree with you that we need to think more about the software 
>>> architecture for maintainability, testability and readability - that's 
>>> why I sent out this proposal to get more ideas from the community.
>>> 
>>> 
>>> -tao
>>> 
>>> -Original Message-
>>> From: Skalicky, Sam [mailto:sska...@amazon.com.INVALID]
>>> Sent: Wednesday, April 10, 2019 2:24 AM
>>> To: dev@mxnet.incubator.apache.org
>>> Subject: Re: [MXNET 2.0 Wishlist] [DISCUSS] Refine the 
>>> InferStorageType and memory planning pass
>>> 
>>> I agree with Tianqi. We should let MKLDNN participate in memory 
>>> planning by first having a separate NNVM pass and then using that 
>>> info in the regular memory planning phase.
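
A rough sketch of that ordering in plain Python; this is a toy illustration
only, not the actual NNVM pass API, and the node fields and the Convolution
check are assumptions made for the example:

def mkldnn_layout_pass(graph, block=16):
    # annotate conv-like nodes with a blocked layout and the padded
    # channel count their MKL-DNN kernels would require
    for node in graph["nodes"]:
        if node["op"] == "Convolution":
            n, c, h, w = node["shape"]
            c_pad = ((c + block - 1) // block) * block
            node["layout"] = "NCHW%dc" % block
            node["alloc_shape"] = (n, c_pad, h, w)
        else:
            node["alloc_shape"] = tuple(node["shape"])
    return graph

def plan_memory(graph, dtype_bytes=4):
    # size each buffer from the annotated (possibly padded) shape,
    # so the padded channels are accounted for up front
    total = 0
    for node in graph["nodes"]:
        size = dtype_bytes
        for d in node["alloc_shape"]:
            size *= d
        node["bytes"] = size
        total += size
    return total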
>>> 
>>> It's starting to sound like MKLDNN should be treated as an 
>>> accelerator rather than an operator library, as it has explicit 
>>> needs and can provide acceleration when given extra capabilities in 
>>> MXNet, like having input to the memory planning NNVM pass. It also 
>>> has special tensor formatting needs and conversions that could be 
>>> better architected than they currently are.
>>> 
>>> We need to think about how we want to architect this for 
>>> maintainability, testability, and readability.
>>> 
>>> Sam
>>> 
>>> 
 On Apr 9, 2019, at 11:11 AM, Tianqi Chen 
 
>>> wrote:
 
The layout transformation should really be a separate optimization 
pass rather than part of memory planning, as is done in the TVM stack. 
If we want to do a clean-slate solution, I would recommend looking into 
that instead.
 
Tianqi
 
> On Tue, Apr 9, 2019 at 1:46 AM Lv, Tao A  wrote:
> 
> 
> 
> Hi dev,
> 
> 
> 
> As we're discussing the roadmap for MXNet 2.0, I would like to 
> start a thread about refining the InferStorageType and memory 
> planning pass in MXNet and hope it can happen as a part of the 2.0 
> release.
> 
> 
> 
> Thanks to 

RE: [MXNET 2.0 Wishlist] [DISCUSS] Refine the InferStorageType and memory planning pass

2019-04-10 Thread Lv, Tao A

@Tianqi,

Thank you for the information. I will take a look at that to see if we can take 
advantage of it.

@Junru,

The reason why we want to hold this change until 2.0 is that there is a 
discussion in the TVM community that NNVM will be deprecated soon, and then I 
think MXNet will have to move to a new IR, either NNVM v2 or Relay. As most 
changes in this proposal are related to IR passes, we definitely don't want to 
spend much effort on something that is being deprecated. 2.0 seems to be a more 
appropriate time for us to make these changes. But I agree with you, we can 
start to do some experiments on the existing architecture and the NNVM IR.

-tao

-Original Message-
From: Junru Shao [mailto:junrushao1...@gmail.com] 
Sent: Wednesday, April 10, 2019 1:34 PM
To: dev@mxnet.incubator.apache.org
Subject: Re: [MXNET 2.0 Wishlist] [DISCUSS] Refine the InferStorageType and 
memory planning pass

Agreed with Tianqi that we could have a better implementation once we have 
better TVM NNVM v2 integration. For now, I believe that we shouldn't block the 
development of the Intel folks.

On Tue, Apr 9, 2019 at 10:10 PM Tianqi Chen 
wrote:

> This kind of conversion can be viewed as an enhanced version of the 
> AlterOpLayout pass in TVM Relay.
>
> On Tue, Apr 9, 2019 at 8:03 PM Lv, Tao A  wrote:
>
> >
> > Thank you Tianqi and Sam for the kind suggestions.
> >
> > @Tianqi,
> >
> > Can you please point me to the code of this pass, or do you think 
> > anyone from the TVM community can help educate me on this? I'm very 
> > happy to learn from that.
> >
> > Just one note: we are not only doing layout transformation but also 
> > need extra memory for the layout transformation. For example, 
> > (N=32, C=3, H=256, W=256) will be padded to (N=32, C=16, H=256, W=256) 
> > along the channel dimension and then converted from (N=32, C=16, H=256, 
> > W=256) to nchw16c so we can leverage the corresponding optimal 
> > computation kernels. That's why we also need changes to the memory 
> > planning pass.
> >
> >
> > @Sam,
> >
> > Yes, definitely we're treating MKL-DNN as an accelerator on CPU.
> > Previously we used it to accelerate certain critical operators in MXNet 
> > in certain situations, e.g. FP32 convolution/deconvolution/fullyConnected, 
> > etc. But as both MXNet and MKL-DNN have evolved, we have started to do 
> > more that is not supported by the original MXNet CPU implementation, 
> > such as quantization and graph fusion. So the MKL-DNN backend is also 
> > changing from a simple `accelerator` to a `default` backend on CPU. And I 
> > totally agree with you that we need to think more about the software 
> > architecture for maintainability, testability and readability - that's 
> > why I sent out this proposal to get more ideas from the community.
> >
> >
> > -tao
> >
> > -Original Message-
> > From: Skalicky, Sam [mailto:sska...@amazon.com.INVALID]
> > Sent: Wednesday, April 10, 2019 2:24 AM
> > To: dev@mxnet.incubator.apache.org
> > Subject: Re: [MXNET 2.0 Wishlist] [DISCUSS] Refine the 
> > InferStorageType and memory planning pass
> >
> > I agree with Tianqi. We should let MKLDNN participate in memory 
> > planning by first having a separate NNVM pass and then using that 
> > info in the regular memory planning phase.
> >
> > It's starting to sound like MKLDNN should be treated as an 
> > accelerator rather than an operator library, as it has explicit 
> > needs and can provide acceleration when given extra capabilities in 
> > MXNet, like having input to the memory planning NNVM pass. It also 
> > has special tensor formatting needs and conversions that could be 
> > better architected than they currently are.
> >
> > We need to think about how we want to architect this for 
> > maintainability, testability, and readability.
> >
> > Sam
> >
> >
> > > On Apr 9, 2019, at 11:11 AM, Tianqi Chen 
> > > 
> > wrote:
> > >
> > > The layout transformation should really be a separate optimization 
> > > pass rather than part of memory planning, as is done in the TVM stack. 
> > > If we want to do a clean-slate solution, I would recommend looking into 
> > > that instead.
> > >
> > > Tianqi
> > >
> > > On Tue, Apr 9, 2019 at 1:46 AM Lv, Tao A  wrote:
> > >
> > >>
> > >>
> > >> Hi dev,
> > >>
> > >>
> > >>
> > >> As we're discussing the roadmap for MXNet 2.0, I would like to 
> > >> start a thread about refining the InferStorageType and memory 
> > >> planning pass in MXNet and hope it can happen as a part of the 2.0 
> > >> release.
> > >>
> > >>
> > >>
> > >> Thanks to @eric-haibin-lin, part of the proposal has already been 
> > >> discussed in issue #13598 [1].
> > >>
> > >>
> > >>
> > >> As mentioned in the description of issue #13598, there are 
> > >> several drawbacks of the existing flow. Please allow me to quote them 
> > >> here:
> > >> * the selection of MKL/CPU/GPU/CUDNN implementation happens after 
> > >> graph attribute inference and memory planning, memory