Proposal for Conversion from FP32 to Mixed Precision Models

2019-04-29 Thread Anirudh Subramanian
Hi all,

I have created a doc for conversion from FP32 to Mixed Precision Models:
https://cwiki.apache.org/confluence/display/MXNET/Conversion+from+FP32+to+Mixed+Precision+Models

I look forward to your feedback on the same.

Thanks,
Anirudh


RE: Proposal for Conversion from FP32 to Mixed Precision Models

2019-04-29 Thread Lv, Tao A
Thank you for sharing this, Anirudh.

Curious to know:
- what will be saved in a training checkpoint or snapshot? Can it be resumed on 
another platform which might not support the lower precision the previous one 
used?
- what will be saved in the final symbol.json and params file when training is 
finished?
- more generally, what will be saved when users want to serialize their model 
to disk?

Thank you,
-tao

-Original Message-
From: Anirudh Subramanian [mailto:anirudh2...@gmail.com] 
Sent: Monday, April 29, 2019 7:00 PM
To: dev@mxnet.incubator.apache.org
Subject: Proposal for Conversion from FP32 to Mixed Precision Models

Hi all,

I have created a doc for conversion from FP32 to Mixed Precision Models:
https://cwiki.apache.org/confluence/display/MXNET/Conversion+from+FP32+to+Mixed+Precision+Models

I look forward to your feedback on the same.

Thanks,
Anirudh


Re: Proposal for Conversion from FP32 to Mixed Precision Models

2019-04-29 Thread Anirudh Subramanian
Hi Tao,

The APIs proposed: "convert_model" and "convert_block" are mainly for
inference use cases, where customers bring a FP32 model to convert it to a
mixed precision model to get improved performance while not losing out on
the accuracy.
The PR: https://github.com/apache/incubator-mxnet/pull/14173 is supposed to
handle the training use cases and this proposal doesn't cover the AMP
feature added in the PR. I think ptrendx@ and canoerst@ are better equipped
to answer questions 1 and 2.

> - more generally, what will be saved when users want to serialize their
model to disk?

Let's say users want to save a converted mixed precision model used for
inference to disk. It will save both the symbol (with the amp_cast and
amp_multicast operators) and the params (which are cast if necessary).
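
To make that concrete, here is a minimal sketch of what the save path might
look like with the proposed API (the amp namespace, argument names and file
names below are assumptions based on the proposal, not a finalized
interface):

    import mxnet as mx
    from mxnet.contrib import amp   # assumed namespace for the proposed API

    # Load a trained FP32 symbolic model (checkpoint prefix is hypothetical).
    sym, arg_params, aux_params = mx.model.load_checkpoint('resnet50_fp32', 0)

    # Convert to mixed precision for inference: amp_cast / amp_multicast
    # operators are inserted into the symbol, params are cast where needed.
    amp_sym, amp_args, amp_aux = amp.convert_model(
        sym, arg_params, aux_params, target_dtype='float16')

    # Serialization keeps both pieces: the rewritten symbol and the params.
    amp_sym.save('resnet50_amp-symbol.json')
    save_dict = {'arg:%s' % k: v for k, v in amp_args.items()}
    save_dict.update({'aux:%s' % k: v for k, v in amp_aux.items()})
    mx.nd.save('resnet50_amp-0000.params', save_dict)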

Anirudh


On Mon, Apr 29, 2019 at 6:55 AM Lv, Tao A  wrote:

> Thank you for sharing this, Anirudh.
>
> Curious to know:
> - what will be saved in a training checkpoint or snapshot? Can it be
> resumed on another platform which might not support the lower precision the
> previous one used?
> - what will be saved in the final symbol.json and params file when
> training is finished?
> - more generally, what will be saved when users want to serialize their
> model to disk?
>
> Thank you,
> -tao
>
> -Original Message-
> From: Anirudh Subramanian [mailto:anirudh2...@gmail.com]
> Sent: Monday, April 29, 2019 7:00 PM
> To: dev@mxnet.incubator.apache.org
> Subject: Proposal for Conversion from FP32 to Mixed Precision Models
>
> Hi all,
>
> I have created a doc for conversion from FP32 to Mixed Precision Models:
>
> https://cwiki.apache.org/confluence/display/MXNET/Conversion+from+FP32+to+Mixed+Precision+Models
>
> I look forward to your feedback on the same.
>
> Thanks,
> Anirudh
>


RE: Proposal for Conversion from FP32 to Mixed Precision Models

2019-04-29 Thread Lv, Tao A
Thank you for the explanation. Sorry I didn't realize the proposal is for 
inference only.

Then how do you think the amp_cast and amp_multicast operators in this
proposal can work with the existing INT8 quantization workflow, which I think
should also be considered 'mixed precision'?

-Original Message-
From: Anirudh Subramanian [mailto:anirudh2...@gmail.com] 
Sent: Monday, April 29, 2019 10:25 PM
To: dev@mxnet.incubator.apache.org
Subject: Re: Proposal for Conversion from FP32 to Mixed Precision Models

Hi Tao,

The APIs proposed: "convert_model" and "convert_block" are mainly for inference 
use cases, where customers bring a FP32 model to convert it to a mixed 
precision model to get improved performance while not losing out on the 
accuracy.
The PR: https://github.com/apache/incubator-mxnet/pull/14173 is supposed to 
handle the training use cases and this proposal doesn't cover the AMP feature 
added in the PR. I think ptrendx@ and canoerst@ are better equipped to answer 
questions 1 and 2.

> - more generally, what will be saved when users want to serialize 
> their
model to disk?

Lets say users want to save converted mixed precision model used for inference 
to disk. It will save both, the symbol with the amp_cast and amp_multicast 
operators and the params (which are casted if necessary).

Anirudh


On Mon, Apr 29, 2019 at 6:55 AM Lv, Tao A  wrote:

> Thank you for sharing this, Anirudh.
>
> Curious to know:
> - what will be saved in a training checkpoint or snapshot? Can it be 
> resumed on another platform which might not support the lower 
> precision the previous one used?
> - what will be saved in the final symbol.json and params file when 
> training is finished?
> - more generally, what will be saved when users want to serialize 
> their model to disk?
>
> Thank you,
> -tao
>
> -Original Message-
> From: Anirudh Subramanian [mailto:anirudh2...@gmail.com]
> Sent: Monday, April 29, 2019 7:00 PM
> To: dev@mxnet.incubator.apache.org
> Subject: Proposal for Conversion from FP32 to Mixed Precision Models
>
> Hi all,
>
> I have created a doc for conversion from FP32 to Mixed Precision Models:
>
> https://cwiki.apache.org/confluence/display/MXNET/Conversion+from+FP32
> +to+Mixed+Precision+Models
>
> I look forward to your feedback on the same.
>
> Thanks,
> Anirudh
>


Re: Proposal for Conversion from FP32 to Mixed Precision Models

2019-04-29 Thread Anirudh Subramanian
Hi Tao,

Thanks for raising this question! I thought about the existing quantization
workflow and whether it can be included under the AMP API. Although
quantization can be considered mixed precision, there are differences. For
example, only a small number of operators can be quantized, compared to the
operators that can run in FP16 precision. Thus, overriding operators to run
in the original dtype vs. the target dtype doesn't make much sense for
quantization.

Also, the quantization workflow may require a calibration dataset and a
calib_mode to calibrate the min and max values.
Arriving at a common API for quantization with calibration and mixed
precision inference (FP16 and BF16) may make the API too complicated and not
very easy to use. I understand that this may cause some confusion, as people
may try to use a target_dtype of int8, but I think it's still better than
causing user confusion with the API usage.

Also, when we move the quantize_model APIs outside contrib, we can consider
adding them under the AMP namespace. The challenge would then be to educate
users on the difference between "quantize" and "convert".
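
To illustrate the difference between the two workflows, here is a rough
side-by-side sketch (the quantize_model arguments are abbreviated from my
memory of the contrib API and the amp namespace is assumed from the
proposal, so treat both calls as illustrative rather than exact):

    import mxnet as mx
    from mxnet.contrib import quantization  # existing INT8 contrib workflow
    from mxnet.contrib import amp           # assumed namespace, per proposal

    sym, arg_params, aux_params = mx.model.load_checkpoint('model_fp32', 0)

    # INT8 needs a calibration dataset plus a calib_mode, and only a subset
    # of operators has quantized implementations.
    calib_iter = mx.io.NDArrayIter(
        data=mx.nd.random.uniform(shape=(512, 3, 224, 224)), batch_size=64)
    qsym, qargs, qaux = quantization.quantize_model(
        sym, arg_params, aux_params, ctx=mx.cpu(),
        calib_mode='entropy', calib_data=calib_iter, num_calib_examples=512)

    # FP16/BF16 conversion needs no calibration; operators without FP16
    # support simply keep running in the original dtype.
    amp_sym, amp_args, amp_aux = amp.convert_model(
        sym, arg_params, aux_params, target_dtype='float16')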

Anirudh

On Mon, Apr 29, 2019 at 7:45 AM Lv, Tao A  wrote:

> Thank you for the explanation. Sorry I didn't realize the proposal is for
> inference only.
>
> Then how do you think the amp_cast and amp_multicast in this proposal can
> work with the existing INT8 quantization workflow which I think should also
> be considered as 'mixed precision'.
>
> -Original Message-
> From: Anirudh Subramanian [mailto:anirudh2...@gmail.com]
> Sent: Monday, April 29, 2019 10:25 PM
> To: dev@mxnet.incubator.apache.org
> Subject: Re: Proposal for Conversion from FP32 to Mixed Precision Models
>
> Hi Tao,
>
> The APIs proposed: "convert_model" and "convert_block" are mainly for
> inference use cases, where customers bring a FP32 model to convert it to a
> mixed precision model to get improved performance while not losing out on
> the accuracy.
> The PR: https://github.com/apache/incubator-mxnet/pull/14173 is supposed
> to handle the training use cases and this proposal doesn't cover the AMP
> feature added in the PR. I think ptrendx@ and canoerst@ are better
> equipped to answer questions 1 and 2.
>
> > - more generally, what will be saved when users want to serialize
> > their
> model to disk?
>
> Lets say users want to save converted mixed precision model used for
> inference to disk. It will save both, the symbol with the amp_cast and
> amp_multicast operators and the params (which are casted if necessary).
>
> Anirudh
>
>
> On Mon, Apr 29, 2019 at 6:55 AM Lv, Tao A  wrote:
>
> > Thank you for sharing this, Anirudh.
> >
> > Curious to know:
> > - what will be saved in a training checkpoint or snapshot? Can it be
> > resumed on another platform which might not support the lower
> > precision the previous one used?
> > - what will be saved in the final symbol.json and params file when
> > training is finished?
> > - more generally, what will be saved when users want to serialize
> > their model to disk?
> >
> > Thank you,
> > -tao
> >
> > -Original Message-
> > From: Anirudh Subramanian [mailto:anirudh2...@gmail.com]
> > Sent: Monday, April 29, 2019 7:00 PM
> > To: dev@mxnet.incubator.apache.org
> > Subject: Proposal for Conversion from FP32 to Mixed Precision Models
> >
> > Hi all,
> >
> > I have created a doc for conversion from FP32 to Mixed Precision Models:
> >
> > https://cwiki.apache.org/confluence/display/MXNET/Conversion+from+FP32
> > +to+Mixed+Precision+Models
> >
> > I look forward to your feedback on the same.
> >
> > Thanks,
> > Anirudh
> >
>


Re: [RFC] Support for creation of Large Tensors in MXNet

2019-04-29 Thread Lin Yuan
Tao,

- what's the max size of dimensionality? Which data type is used to define
dimensionality (ndims)?
We assume the max size of dimensionality is relatively small. Hence the
`int` data type is used to define ndim.

- what's the max size of each dimension? Which data type is used to define
dimension size (shape[x])?
Currently, we assume the max size of each dimension is not going to exceed
2^31 in real applications. Hence the data type is `int32_t`.

- what's the max size of total elements? Which data type is used to define
element size (Prod(shape))?
We assume the total number of elements in a tensor can be larger than 2^32
in some applications such as the deep graph library (DGL). We use the data
type `int64_t` to represent the total element size. Currently, due to
performance regressions in some operators (such as transpose), we use a
compiler flag to set this data type to `int32_t` by default. Once we have
ways to mitigate the performance regressions, we will set the default data
type to `int64_t`, which is part of the effort in this project that Rohit
proposed.
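
A quick numeric sanity check of those three ranges, using an illustrative
shape (for example a large embedding matrix from a DGL-style workload):

    import numpy as np

    shape = (6_000_000, 1_024)   # ndim = 2, so a plain C `int` is plenty
    ndim = len(shape)

    # Each individual dimension stays below 2^31 - 1, so `int32_t` works.
    assert max(shape) < np.iinfo(np.int32).max

    # The total element count (~6.1 billion) exceeds 2^32, so Prod(shape)
    # needs `int64_t`.
    total = int(np.prod(shape, dtype=np.int64))
    assert total > np.iinfo(np.uint32).max
    print(ndim, max(shape), total)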

What is the plan in MKLDNN to support large tensors? We may want to
coordinate the progress since many operators are using the MKLDNN
implementation on CPU now.

Many Thanks,

Lin

On Sun, Apr 28, 2019 at 7:52 PM Lv, Tao A  wrote:

> Thank you for bringing this topic to dev, Rohit.
>
> Regarding large tensor, can you articulate:
> - what's the max size of dimensionality? Which data type is used to define
> dimensionality (ndims)?
> - what's the max size of each dimension? Which data type is used to define
> dimension size (shape[x])?
> - what's the max size of total elements? Which data type is used to define
> element size (Prod(shape))?
>
> For me, any of these three can be *large*.
>
> -Original Message-
> From: Srivastava, Rohit Kumar [mailto:srivastava@buckeyemail.osu.edu]
> Sent: Saturday, April 27, 2019 7:33 AM
> To: dev@mxnet.incubator.apache.org
> Subject: [RFC] Support for creation of Large Tensors in MXNet
>
> Dear Community,
>
> Currently MXNet supports creation of Tensors containing up to 2^32
> elements. However, there are cases where tensors of over 5 billion elements
> are required.
>
> We plan to support creation of large tensors on MXNet. A design proposal
> is ready for review:
> https://cwiki.apache.org/confluence/display/MXNET/Large+Tensor+Support
>
> We will appreciate any help and feedbacks from the community.
>
> Thank you!
>
> Rohit
>


Re: Proposal for Conversion from FP32 to Mixed Precision Models

2019-04-29 Thread Zach Kimberg
I have one suggestion. In the current design, there are additional maps from
each input entry to its cast entry for each target dtype, in order to avoid
creating duplicate casts. Instead of creating these, another option is to
use a general purpose Common Subexpression Elimination (CSE) [1] pass to
apply afterwards. So, you would run the mixed precision pass which creates
the duplicates and then the CSE pass which would remove all duplicates.

This design is common in existing compilers like LLVM because maintaining
and testing the passes is much easier when they are kept as simple as
possible. The CSE can also be reused as necessary for other passes that
could create duplicates or to remove duplicate expressions in general. This
tutorial [2] talks about it a bit.
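
As a toy sketch of the idea (outside of NNVM, just to show how a structural
CSE pass collapses the duplicate casts that the mixed precision pass would
then be free to emit):

    # Toy common-subexpression-elimination over a tiny expression graph.
    # This is not NNVM code, only an illustration of the technique.
    class Node:
        def __init__(self, op, inputs=(), attrs=()):
            self.op, self.inputs, self.attrs = op, tuple(inputs), tuple(attrs)

    def cse(outputs):
        seen = {}   # structural key -> canonical node
        memo = {}   # id(original node) -> deduplicated node

        def visit(node):
            if id(node) in memo:
                return memo[id(node)]
            new_inputs = tuple(visit(i) for i in node.inputs)
            key = (node.op, tuple(id(i) for i in new_inputs), node.attrs)
            if key not in seen:
                seen[key] = Node(node.op, new_inputs, node.attrs)
            memo[id(node)] = seen[key]
            return seen[key]

        return [visit(o) for o in outputs]

    # Two identical casts of the same input collapse into one after the pass.
    x = Node('data')
    c1 = Node('amp_cast', [x], (('dtype', 'float16'),))
    c2 = Node('amp_cast', [x], (('dtype', 'float16'),))
    y = Node('elemwise_add', [c1, c2])
    out, = cse([y])
    assert out.inputs[0] is out.inputs[1]   # duplicate cast eliminated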

Zach

[1] - https://en.wikipedia.org/wiki/Common_subexpression_elimination
[2] - https://blog.regehr.org/archives/1603

On Mon, Apr 29, 2019 at 9:26 AM Anirudh Subramanian 
wrote:

> Hi Tao,
>
> Thanks for raising this question! I thought about the existing quantization
> workflow and whether it can be included with the AMP API. Although
> quantization can be considered as mixed precision, there are differences.
> For example, only a small number of operators can be quantized compared to
> the operators that can run in FP16 precision. Thus, overriding the
> operators to run in original dtype vs target dtype doesnt make much sense
> for quantization.
>
> Also, quantization workflow may require a calibration dataset to calibrate
> the min and max and calib_mode.
> Arriving at a common API, for quantization with calibration and mixed
> precision inference (FP16 and BF16) may make the API too complicated and
> not very easy to use. I understand that this may cause some confusion as
> people may try to use target_dtype of int8 but I think its still better
> than causing user confusion with the API usage.
>
> Also, when we move quantize_model APIs outside contrib we can consider
> adding them under AMP namespace. The challenge would then be to educate
> users on difference between "quantize" and "convert".
>
> Anirudh
>
> On Mon, Apr 29, 2019 at 7:45 AM Lv, Tao A  wrote:
>
> > Thank you for the explanation. Sorry I didn't realize the proposal is for
> > inference only.
> >
> > Then how do you think the amp_cast and amp_multicast in this proposal can
> > work with the existing INT8 quantization workflow which I think should
> also
> > be considered as 'mixed precision'.
> >
> > -Original Message-
> > From: Anirudh Subramanian [mailto:anirudh2...@gmail.com]
> > Sent: Monday, April 29, 2019 10:25 PM
> > To: dev@mxnet.incubator.apache.org
> > Subject: Re: Proposal for Conversion from FP32 to Mixed Precision Models
> >
> > Hi Tao,
> >
> > The APIs proposed: "convert_model" and "convert_block" are mainly for
> > inference use cases, where customers bring a FP32 model to convert it to
> a
> > mixed precision model to get improved performance while not losing out on
> > the accuracy.
> > The PR: https://github.com/apache/incubator-mxnet/pull/14173 is supposed
> > to handle the training use cases and this proposal doesn't cover the AMP
> > feature added in the PR. I think ptrendx@ and canoerst@ are better
> > equipped to answer questions 1 and 2.
> >
> > > - more generally, what will be saved when users want to serialize
> > > their
> > model to disk?
> >
> > Lets say users want to save converted mixed precision model used for
> > inference to disk. It will save both, the symbol with the amp_cast and
> > amp_multicast operators and the params (which are casted if necessary).
> >
> > Anirudh
> >
> >
> > On Mon, Apr 29, 2019 at 6:55 AM Lv, Tao A  wrote:
> >
> > > Thank you for sharing this, Anirudh.
> > >
> > > Curious to know:
> > > - what will be saved in a training checkpoint or snapshot? Can it be
> > > resumed on another platform which might not support the lower
> > > precision the previous one used?
> > > - what will be saved in the final symbol.json and params file when
> > > training is finished?
> > > - more generally, what will be saved when users want to serialize
> > > their model to disk?
> > >
> > > Thank you,
> > > -tao
> > >
> > > -Original Message-
> > > From: Anirudh Subramanian [mailto:anirudh2...@gmail.com]
> > > Sent: Monday, April 29, 2019 7:00 PM
> > > To: dev@mxnet.incubator.apache.org
> > > Subject: Proposal for Conversion from FP32 to Mixed Precision Models
> > >
> > > Hi all,
> > >
> > > I have created a doc for conversion from FP32 to Mixed Precision
> Models:
> > >
> > > https://cwiki.apache.org/confluence/display/MXNET/Conversion+from+FP32
> > > +to+Mixed+Precision+Models
> > >
> > > I look forward to your feedback on the same.
> > >
> > > Thanks,
> > > Anirudh
> > >
> >
>


Re: Clojure MXNet Monthly Update

2019-04-29 Thread Pedro Larroy
Nice! I would suggest that we use the Medium account for MXNet and
have a "this month in MXNet" as they do in other open source projects,
and just have a Clojure section. I think it gives nice visibility to
the project and attracts contributors. Maybe just copy your updates to
the "main one"?

Pedro.


On Fri, Apr 26, 2019 at 1:45 PM Carin Meier  wrote:
>
> I've started a monthly blog update targeted for the Clojure community but I
> thought I would share it here too :)
>
> http://gigasquidsoftware.com/blog/2019/04/26/clojure-mxnet-april-update/
>
> http://gigasquidsoftware.com/blog/2019/03/22/clojure-mxnet-march-update/
>
> Best,
> Carin


Re: Clojure MXNet Monthly Update

2019-04-29 Thread Carin Meier
Thanks for the feedback. I remember a wiki page for the monthly updates but
I can't remember the exact location. Does anyone have a link handy?

I would be happy to add the updates to a Clojure section of the "main" one.
I will most likely continue to put out a targeted one for the Clojure
community as well, since I think that is valuable. Not to mention the fact
that I don't think my cat pictures would work in the main document :)

- Carin

On Mon, Apr 29, 2019 at 3:36 PM Pedro Larroy 
wrote:

> nice!  I would suggest that we use the medium account for MXNet and
> have a "this month in MXNet" as they do in other open source projects
> and just have a clojure section? I think it gives nice visibility to
> the project and attracts contributors. Maybe just copy your updates to
> the "main one"?
>
> Pedro.
>
>
> On Fri, Apr 26, 2019 at 1:45 PM Carin Meier  wrote:
> >
> > I've started a monthly blog update targeted for the Clojure community
> but I
> > thought I would share it here too :)
> >
> > http://gigasquidsoftware.com/blog/2019/04/26/clojure-mxnet-april-update/
> >
> > http://gigasquidsoftware.com/blog/2019/03/22/clojure-mxnet-march-update/
> >
> > Best,
> > Carin
>


[Proposal] MXNet operator benchmark library

2019-04-29 Thread sandeep krishnamurthy
Hello Community,

I am currently working on building a utility/library to help us easily do
individual operator benchmarking in MXNet. I have documented the proposal in
this cwiki, and I am staging the current development in this github
repository. The proposal is to get this library under
incubator-mxnet/benchmark/. Please do review and provide your feedback and
suggestions.
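
For readers who have not opened the cwiki yet, the core idea is roughly the
following (a minimal hand-rolled sketch, not the proposed library's API):

    import time
    import mxnet as mx

    def benchmark_op(op, *args, warmup=10, runs=100, **kwargs):
        # Time a single operator; waitall() forces MXNet's async engine to
        # finish, so the measurement covers execution, not just enqueueing.
        for _ in range(warmup):
            op(*args, **kwargs)
        mx.nd.waitall()
        start = time.perf_counter()
        for _ in range(runs):
            op(*args, **kwargs)
        mx.nd.waitall()
        return (time.perf_counter() - start) / runs

    lhs = mx.nd.random.uniform(shape=(1024, 1024))
    rhs = mx.nd.random.uniform(shape=(1024, 1024))
    print('dot: %.3f ms' % (benchmark_op(mx.nd.dot, lhs, rhs) * 1000))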

Thanks to fellow MXNet community members Lin, Sam, and Rohit for providing
initial ideas and suggestions.

Best,
Sandeep




-- 
Sandeep Krishnamurthy


Re: Proposal for Conversion from FP32 to Mixed Precision Models

2019-04-29 Thread Anirudh Subramanian
Hi Zach,

You raise an interesting point. Thank you for the pointer!

Incorporating a CSE pass comes with its own cost, and the advantage it
brings is to make the ReducePrecision NNVM pass more lightweight. Since the
amortized cost of the ReducePrecision pass is O(1), it shouldn't matter much
whether we add it or not from a performance point of view.

From a maintenance point of view, I would agree that separating these two
pieces of logic can be helpful if we have other such workflows which require
the original pass followed by a CSE pass. Currently, as far as I know, only
the ReducePrecision pass would be using it. I will check to see if a CSE
pass can also benefit other NNVM passes, such as the quantization pass, and
will get back.

Anirudh

On Mon, Apr 29, 2019 at 11:18 AM Zach Kimberg 
wrote:

> I have one suggestion. In the current design, there are the additional maps
> from each input entry to each target casted entry dtype in order to avoid
> creating duplicate casts. Instead of creating these, another option is to
> use a general purpose Common Subexpression Elimination (CSE) [1] pass to
> apply afterwards. So, you would run the mixed precision pass which creates
> the duplicates and then the CSE pass which would remove all duplicates.
>
> This design is common in existing compilers like LLVM because maintaining
> and testing the passes is much easier when they are kept as simple as
> possible. The CSE can also be reused as necessary for other passes that
> could create duplicates or to remove duplicate expressions in general. This
> tutorial [2] talks about it a bit.
>
> Zach
>
> [1] - https://en.wikipedia.org/wiki/Common_subexpression_elimination
> [2] - https://blog.regehr.org/archives/1603
>
> On Mon, Apr 29, 2019 at 9:26 AM Anirudh Subramanian  >
> wrote:
>
> > Hi Tao,
> >
> > Thanks for raising this question! I thought about the existing
> quantization
> > workflow and whether it can be included with the AMP API. Although
> > quantization can be considered as mixed precision, there are differences.
> > For example, only a small number of operators can be quantized compared
> to
> > the operators that can run in FP16 precision. Thus, overriding the
> > operators to run in original dtype vs target dtype doesnt make much sense
> > for quantization.
> >
> > Also, quantization workflow may require a calibration dataset to
> calibrate
> > the min and max and calib_mode.
> > Arriving at a common API, for quantization with calibration and mixed
> > precision inference (FP16 and BF16) may make the API too complicated and
> > not very easy to use. I understand that this may cause some confusion as
> > people may try to use target_dtype of int8 but I think its still better
> > than causing user confusion with the API usage.
> >
> > Also, when we move quantize_model APIs outside contrib we can consider
> > adding them under AMP namespace. The challenge would then be to educate
> > users on difference between "quantize" and "convert".
> >
> > Anirudh
> >
> > On Mon, Apr 29, 2019 at 7:45 AM Lv, Tao A  wrote:
> >
> > > Thank you for the explanation. Sorry I didn't realize the proposal is
> for
> > > inference only.
> > >
> > > Then how do you think the amp_cast and amp_multicast in this proposal
> can
> > > work with the existing INT8 quantization workflow which I think should
> > also
> > > be considered as 'mixed precision'.
> > >
> > > -Original Message-
> > > From: Anirudh Subramanian [mailto:anirudh2...@gmail.com]
> > > Sent: Monday, April 29, 2019 10:25 PM
> > > To: dev@mxnet.incubator.apache.org
> > > Subject: Re: Proposal for Conversion from FP32 to Mixed Precision
> Models
> > >
> > > Hi Tao,
> > >
> > > The APIs proposed: "convert_model" and "convert_block" are mainly for
> > > inference use cases, where customers bring a FP32 model to convert it
> to
> > a
> > > mixed precision model to get improved performance while not losing out
> on
> > > the accuracy.
> > > The PR: https://github.com/apache/incubator-mxnet/pull/14173 is
> supposed
> > > to handle the training use cases and this proposal doesn't cover the
> AMP
> > > feature added in the PR. I think ptrendx@ and canoerst@ are better
> > > equipped to answer questions 1 and 2.
> > >
> > > > - more generally, what will be saved when users want to serialize
> > > > their
> > > model to disk?
> > >
> > > Lets say users want to save converted mixed precision model used for
> > > inference to disk. It will save both, the symbol with the amp_cast and
> > > amp_multicast operators and the params (which are casted if necessary).
> > >
> > > Anirudh
> > >
> > >
> > > On Mon, Apr 29, 2019 at 6:55 AM Lv, Tao A  wrote:
> > >
> > > > Thank you for sharing this, Anirudh.
> > > >
> > > > Curious to know:
> > > > - what will be saved in a training checkpoint or snapshot? Can it be
> > > > resumed on another platform which might not support the lower
> > > > precision the previous one used?
> > > > - what will be saved in the final symbol.json and params file w

Invitation to join the #mxnet slack channel

2019-04-29 Thread Nikhil Kulkarni
Hi,

I'd like to be a part of the MXNet Slack channel. Please send me an
invitation for the same.

Thanks for your time.

-- 
Regards,
  -Nikhil Kulkarni


Re: MXNet 1.4.1 Release Proposal

2019-04-29 Thread Junru Shao
Dear community,

We would like to follow up and remind everyone that release candidate 0 for
Apache MXNet 1.4.1 will be cut by the end of the day. We will open a voting
thread for this release tonight.

Thanks,
Junru

On Mon, Apr 8, 2019 at 6:57 PM Junru Shao  wrote:

> Thanks for the great opportunity! Let's wait for some time for fixes and
> proposals and decide the timeline then.
>
> On Mon, Apr 8, 2019 at 1:02 PM Hagay Lupesko  wrote:
>
>> Awesome - thanks Junru and Sheng!
>> I have updated the CWiki to reflect you being the release manager and
>> shepherd.
>>
>> Junru - I suggest we give the community a week more to add critical fix
>> proposals, before we set a timeline. Please feel free to drive this
>> forward, and I'm happy to help as needed.
>>
>> Thanks everyone,
>> Hagay
>>
>> On Thu, Apr 4, 2019 at 2:27 PM Sheng Zha  wrote:
>>
>> > Thanks Hagay for proposing the release and for Junru to volunteer to
>> drive
>> > the release. I will help Junru as the committer for this release.
>> >
>> > -sz
>> >
>> > On Thu, Apr 4, 2019 at 2:18 PM Junru Shao 
>> wrote:
>> >
>> > > Hi Hagay,
>> > >
>> > > I have some experiences in MXNet development, and would love to
>> volunteer
>> > > for driving this release.
>> > >
>> > > Thank you so much!
>> > >
>> > > Best,
>> > > Junru
>> > >
>> > > On Thu, Apr 4, 2019 at 1:51 PM Hagay Lupesko 
>> wrote:
>> > >
>> > > > Hello MXNet community,
>> > > >
>> > > > As previously discussed in [0
>> > > > <
>> > > >
>> > >
>> >
>> https://lists.apache.org/thread.html/a5f444999bf428d06e691b1856392ae5ebb24a3485eaa484a73de10d@%3Cdev.mxnet.apache.org%3E
>> > > > >],
>> > > > and per the feedback from Pedro, Kellen and Sheng, I'd like to
>> propose
>> > > > releasing MXNet 1.4.1.
>> > > > MXNet 1.4.1 is a patch release on top of 1.4.0 (following semver[1
>> > > > ]), that includes backwards compatible bug
>> fixes
>> > -
>> > > a
>> > > > couple I am aware of are mem leaks in Scala API, Gluon RNN and
>> > NDArrays.
>> > > >
>> > > > I went ahead and created a draft release page on CWiki [2
>> > > > <
>> > > >
>> > >
>> >
>> https://cwiki.apache.org/confluence/display/MXNET/%5BDRAFT+PROPOSAL%5D+Apache+MXNet+%28incubating%29+1.4.1+Release+Plan+and+Status
>> > > > >],
>> > > > thanks to Yuxi Hu for adding a mem leak fix, and thanks to Andrew
>> > Ayres,
>> > > > Qing Lan and Sergey Sokolov for fixing bugs in 1.4.0 - I went ahead
>> and
>> > > > added your fixes to the list.
>> > > >
>> > > > Asking the community to:
>> > > > (1) Any bug fix or regression you identified and fixed after 1.4.0
>> > > release?
>> > > > please add it to the release proposal wiki (or msg me on Slack if
>> you
>> > > don't
>> > > > have write access, happy to do it).
>> > > > (2) Any comments or suggestions on the release wiki? please leave
>> > > comments
>> > > > on the wiki or reply to this email.
>> > > > (3) I am looking for volunteers to drive the release - ideally we'll
>> > have
>> > > > two volunteers: a non-committer and a shepherd committer that can
>> also
>> > > help
>> > > > with the logistics that require permissions. This is a great way to
>> > > > contribute to the community and help MXNet!
>> > > >
>> > > > I plan to check-in in a few days and finalize the proposal, so
>> timely
>> > > > response is appreciated.
>> > > >
>> > > > Cheers,
>> > > > Hagay
>> > > >
>> > > > [0]
>> > > >
>> > > >
>> > >
>> >
>> https://lists.apache.org/thread.html/a5f444999bf428d06e691b1856392ae5ebb24a3485eaa484a73de10d@%3Cdev.mxnet.apache.org%3E
>> > > > [1] https://semver.org/
>> > > > [2]
>> > > >
>> > > >
>> > >
>> >
>> https://cwiki.apache.org/confluence/display/MXNET/%5BDRAFT+PROPOSAL%5D+Apache+MXNet+%28incubating%29+1.4.1+Release+Plan+and+Status
>> > > >
>> > >
>> >
>>
>


Re: Invitation to join the #mxnet slack channel

2019-04-29 Thread Sheng Zha
Invite sent. Welcome!

-sz

> On Apr 29, 2019, at 5:17 PM, Nikhil Kulkarni  wrote:
> 
> Hi,
> 
> I'd like to be a part of the MXNet Slack channel. Please send me an
> invitation for the same.
> 
> Thanks for your time.
> 
> -- 
> Regards,
>  -Nikhil Kulkarni


RE: [RFC] Support for creation of Large Tensors in MXNet

2019-04-29 Thread Lv, Tao A
Thank you, Lin! I would expect that the current MKL-DNN implementation
already supports the scenario you mentioned here. This can be verified by
this issue: https://github.com/apache/incubator-mxnet/issues/13451

But as I said before, since we support flatten and reshape operators, it's
possible for users to convert a tensor with a large element count into a
tensor with a large dimension size. That could cause issues there.
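
Concretely, with illustrative numbers:

    import numpy as np

    shape = (2**20, 2**13)                       # each dim well below 2^31
    total = int(np.prod(shape, dtype=np.int64))  # 2^33 elements in total

    int32_max = np.iinfo(np.int32).max
    print(max(shape) < int32_max)   # True:  per-dimension sizes are safe
    print(total < int32_max)        # False: flattening to shape (total,)
                                    #        yields a dim that overflows int32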

To cover more cases, MKL-DNN is going to support INT64 dimension size in its 
coming 1.0 major release.

-tao

-Original Message-
From: Lin Yuan [mailto:apefor...@gmail.com] 
Sent: Tuesday, April 30, 2019 12:56 AM
To: dev@mxnet.incubator.apache.org
Subject: Re: [RFC] Support for creation of Large Tensors in MXNet

Tao,

- what's the max size of dimensionality? Which data type is used to define 
dimensionality (ndims)?
We assume the max size of dimensionality is relatively small. Hence `int` data 
type is used to define ndim

- what's the max size of each dimension? Which data type is used to define 
dimension size (shape[x])?
Currently, we assume the max size of each dimension is not going to exceed
2^31 in real applications. Hence the data type is `int32_t`

- what's the max size of total elements? Which data type is used to define 
element size (Prod(shape))?
We assume the total number of elements in a tensor can be larger than 2^32 in 
some applications such as deep graph library. We use the data type `int64_t` to 
represent the total element size. Currently due to performance regression in 
some operators (such as transpose), we used a compiler flag to set this data 
type to `int32_t` by default. Once we have ways to mitigate the performance 
regression, we will set the default data type to `int64_t`, which is part of 
the effort in this project that Rohit proposed.

What is the plan in MKLDNN to support large tensors? We may want to coordinate 
the progress since many operators are using MKLDNN implementation in CPU now.

Many Thanks,

Lin

On Sun, Apr 28, 2019 at 7:52 PM Lv, Tao A  wrote:

> Thank you for bringing this topic to dev, Rohit.
>
> Regarding large tensor, can you articulate:
> - what's the max size of dimensionality? Which data type is used to 
> define dimensionality (ndims)?
> - what's the max size of each dimension? Which data type is used to 
> define dimension size (shape[x])?
> - what's the max size of total elements? Which data type is used to 
> define element size (Prod(shape))?
>
> For me, any of these three can be *large*.
>
> -Original Message-
> From: Srivastava, Rohit Kumar 
> [mailto:srivastava@buckeyemail.osu.edu]
> Sent: Saturday, April 27, 2019 7:33 AM
> To: dev@mxnet.incubator.apache.org
> Subject: [RFC] Support for creation of Large Tensors in MXNet
>
> Dear Community,
>
> Currently MXNet supports creation of Tensors containing up to 2^32
> elements. However, there are cases where tensors of over 5 billion
> elements are required.
>
> We plan to support creation of large tensors on MXNet. A design 
> proposal is ready for review:
> https://cwiki.apache.org/confluence/display/MXNET/Large+Tensor+Support
>
> We will appreciate any help and feedbacks from the community.
>
> Thank you!
>
> Rohit
>


[Announcement] New Committer - Hao Jin

2019-04-29 Thread Jun Wu
Please join me in welcoming Hao Jin (https://github.com/haojin2) from
AWS as a new committer.

Hao has designed and implemented many sophisticated algorithms for tensor
operations. His work has greatly expanded the coverage of the MXNet operator
inventory and enhanced the performance of many operators that are hard to
optimize. Not only that, Hao has been active in advocating for MXNet by
providing high-quality translations of quite a few technical articles and
blog posts.


[Announcement] New Committer - Zhennan Qin

2019-04-29 Thread Jun Wu
Please join me in welcoming Zhennan Qin (https://github.com/ZhennanQin) from
Intel as a new committer.

Zhennan is the main author of the work accelerating MXNet/MKLDNN inference
through operator fusion and model quantization. His work has placed MXNet in
an advantageous position for inference workloads on Intel CPUs compared with
other DL frameworks.


[VOTE] Release Apache MXNet (incubating) version 1.4.1.rc0

2019-04-29 Thread Junru Shao
Dear MXNet community,

This is the 3-day vote to release Apache MXNet (incubating) version 1.4.1.
The voting on the dev@ list will start Apr 29, 23:59:59 (PST) and close on
May 02, 23:59:59 (PST).

Below are links to
1) Release notes:
https://cwiki.apache.org/confluence/display/MXNET/Apache+MXNet+%28incubating%29+1.4.1+Release+Notes
.
2) Release Candidate:
https://github.com/apache/incubator-mxnet/releases/tag/1.4.1.rc0.
3) Source and signatures on Apache dist server:
https://dist.apache.org/repos/dist/dev/incubator/mxnet/1.4.1.rc0/.

Please remember to TEST first before voting accordingly:
+1 = approve
+0 = no opinion
-1 = disapprove (provide reason)

Best regards,
Junru Shao