Re: C++ api issue labeling

2018-06-22 Thread Lin Yuan
I agree with Hagay. Using "Backend" as the label makes it much easier to
track. The "C++" label only describes the language used in the
implementation, while "Backend" better describes the nature of the work
(suppose we change the backend implementation from C++ to another language
in the future).

Lin

On Fri, Jun 22, 2018 at 1:09 AM Hagay Lupesko  wrote:

> Thanks everyone for chiming in and clarifying.
> It seems that the "C++" label name is confusing for our community since it
> can be interpreted as both the CPP API and the backend...
> As an anecdote, this issue [1
> ] is labeled as
> "C++" but is about the CPP API, not the backend.
>
> Should we just rename "C++" to "Backend" to avoid confusion?
>
> [1] https://github.com/apache/incubator-mxnet/issues/10937
>
> On Thu, Jun 21, 2018 at 12:39 PM Pedro Larroy <
> pedro.larroy.li...@gmail.com>
> wrote:
>
> > Agree with Anirudh, they are different things. Maybe change the "C++"
> label
> > to "backend" would be more informative?
> >
> > On Thu, Jun 21, 2018 at 12:11 PM Anirudh  wrote:
> >
> > > Hi Hagay,
> > >
> > > I think we should keep these two labels separate since they mean
> > > different things.
> > > The "C++" label refers to issues in the MXNet backend, and "CPP package"
> > > refers to the CPP language binding for MXNet.
> > > We can still make the C++ API great again by filtering out CPP
> > > package issues :).
> > >
> > > Anirudh
> > >
> > >
> > > On Thu, Jun 21, 2018 at 11:56 AM, Hagay Lupesko 
> > wrote:
> > >
> > > > Hey community,
> > > >
> > > > I was going over the open GitHub issues for MXNet, and noticed that
> we
> > > have
> > > > two labels for the CPP API: "CPP package", "C++"
> > > >
> > > > Wanted to suggest we remove "CPP package" and just stick to "C++"
> > > > This will make it easier for the community to classify issues and
> focus
> > > on
> > > > making the C++ API great again ;)
> > > >
> > > > Let me know if someone has any concerns, otherwise I will find a
> > > committer
> > > > that I can work with to make this change.
> > > >
> > > > Thanks!
> > > > Hagay
> > > >
> > >
> >
>
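Whatever the labels end up being called, the two categories can already be separated with GitHub's issue-search syntax. For example, to list open issues labeled "C++" that are not also labeled "CPP package" (label names as used in this thread), one could paste the following into the repository's issue search box:

```text
is:issue is:open label:"C++" -label:"CPP package"
```

The `-label:` qualifier excludes the binding issues, which is the filtering Anirudh alludes to above.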


Consolidating developer guide in one place (cwiki preferred)

2018-08-14 Thread Lin Yuan
Dear MXNet community,

As a developer, I noticed we have developer guides scattered across
different websites (mxnet.io, cwiki):

E.g.

How to Create New Operators (Layers): [
https://mxnet.incubator.apache.org/faq/new_op.html]
A Guide to Implementing Sparse Operators in MXNet Backend [
https://cwiki.apache.org/confluence/display/MXNET/A+Guide+to+Implementing+Sparse+Operators+in+MXNet+Backend
]

When searching for a developer guide by keyword, only one of the sites
returns it.

It would be more convenient for developers if all developer guides resided
on cwiki and all user (non-developer) guides on the mxnet.io website. We
can add a link on mxnet.io referring developers to cwiki for guidance.

Any comments are appreciated.

Best Regards,

Lin


Enabling shared filter in JIRA

2018-08-14 Thread Lin Yuan
Dear MXNet Community,

As we are trying to create our Scrum board on JIRA, I noticed that we do
not have permission to create shared filters, even as administrators.
This prevents us from creating scrum boards for the different components of
the project.

I would really appreciate it if someone on the Apache Infra team could
enable shared filter creation for this project.

Best Regards,

Lin


Re: Consolidating developer guide in one place (cwiki preferred)

2018-08-21 Thread Lin Yuan
Hi Aaron,

Thanks for your answer. I think it's a very worthwhile effort to move all
the developer-related content from the mxnet.io website to a dedicated
developer site. Would you like to initiate this effort?

Best,

Lin

On Wed, Aug 15, 2018 at 3:47 PM Haibin Lin  wrote:

> +1
>
> On Wed, Aug 15, 2018 at 1:10 PM, Aaron Markham 
> wrote:
>
> > Hi Lin, I agree with this organization. If you feel like somethings
> should
> > be transitioned from the website to the wiki, I can help with that, but
> for
> > the moment I've been suggesting that new developer-focused content be
> > placed on the wiki.
> >
> > On Tue, Aug 14, 2018 at 10:40 AM, Lin Yuan  wrote:
> >
> > > Dear MXNet community,
> > >
> > > As a developer, I noticed we have some developer guide scattered in
> > > different websites (mxnet.io, cwiki):
> > >
> > > E.g.
> > >
> > > How to Create New Operators (Layers): [
> > > https://mxnet.incubator.apache.org/faq/new_op.html]
> > > A Guide to Implementing Sparse Operators in MXNet Backend [
> > > https://cwiki.apache.org/confluence/display/MXNET/A+
> > > Guide+to+Implementing+Sparse+Operators+in+MXNet+Backend
> > > ]
> > >
> > > When searching developer guide by keyword, only one of them can be
> > returned
> > > on either site.
> > >
> > > It will be more convenient for developers if all the developer guide
> > > resides on cwiki and all user guide (non-developer) on the mxnet.io
> > > website. We can add a link on mxnet.io to refer all developers to
> cwiki
> > > for
> > > guidance.
> > >
> > > Any comment is appreciated.
> > >
> > > Best Regards,
> > >
> > > Lin
> > >
> >
>


Re: Updating MXNet's Cub

2018-08-28 Thread Lin Yuan
+1

On Tue, Aug 28, 2018 at 12:39 AM Hagay Lupesko  wrote:

> Thanks for the feedback Chris. Will follow up.
>
> On Fri, Aug 24, 2018 at 10:53 AM Chris Olivier 
> wrote:
>
> > +1 for pointing to NVidia's repo for the newer Cub and subsequent
> versions.
> >
> > On Fri, Aug 24, 2018 at 10:01 AM Hagay Lupesko 
> wrote:
> >
> > > Hi all,
> > >
> > >
> > > One of MXNet’s submodule dependencies is a snapshot of Nvidia Cub (
> > > https://github.com/dmlc/cub) – the snapshot is of an older version of
> > Cub
> > > (1.7), while the latest Nvidia Cub release is 1.8.  Note that dmlc/cub
> > has
> > > no customizations of the source Cub repo.
> > >
> > >
> > > I’d like to suggest to update the existing Cub submodule to Nvidia’s
> Cub
> > > repo. Instead of the snapshot, MXNet will be using Nvidia’s repo and
> the
> > > latest release (both repos have the same BSD-3 license, so licensing
> > should
> > > not be an issue).
> > >
> > >
> > > Wanted to get feedback from the community to make sure I'm not missing
> > > anything.
> > >
> > > if there are no objections I'll submit a PR for the change.
> > >
> > >
> > > Cheers,
> > >
> > > Hagay
> > >
> >
>
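For reference, repointing an existing submodule at a new upstream, as proposed above, is a small change. A sketch, assuming the submodule is checked out at `3rdparty/cub` and that the release is tagged `v1.8.0` — the actual path and tag name in the MXNet tree may differ:

```shell
# Point the Cub submodule at NVIDIA's upstream repo instead of the dmlc/cub
# snapshot (path and tag are illustrative; check the actual tree first).
git config -f .gitmodules submodule.3rdparty/cub.url https://github.com/NVlabs/cub.git
git submodule sync 3rdparty/cub          # propagate the new URL to .git/config
git submodule update --init 3rdparty/cub
cd 3rdparty/cub && git fetch --tags && git checkout v1.8.0   # pin the 1.8 release
cd ../.. && git add .gitmodules 3rdparty/cub
git commit -m "Update Cub submodule to NVIDIA upstream 1.8"
```

Since both repos carry the same BSD-3 license, the change is mechanical: only `.gitmodules` and the recorded submodule commit move.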


Re: build from source instructions

2018-08-28 Thread Lin Yuan
Aaron,

I agree; the installation page was very confusing to me. When I first tried
to build MXNet from source on macOS, I was totally confused by the
instructions. Why were they vastly different from building from source on
Linux, given that the two OSes have similar shell commands? I feel the
automated scripts on the macOS platform confuse more than they simplify.

Lin

On Mon, Aug 27, 2018 at 9:21 PM Steffen Rochel 
wrote:

> Aaron - we should keep instructions how to build from source. Updating and
> re-organizing makes sense to me.
> Steffen
>
> On Mon, Aug 27, 2018 at 4:54 PM Aaron Markham 
> wrote:
>
> > Hello,
> > I was looking into the C++ instructions and came across this seemingly
> > pretty old page:
> > https://mxnet.incubator.apache.org/install/build_from_source
> >
> > I think it has several inaccuracies as different/updated installation
> info
> > has been added to different pages.
> >
> > Should it be deleted?
> >
> > Or should a specific build from source page be maintained (moving/copying
> > info from the other more recently updated pages)?
> >
> > I'm really thinking that it would be easier to maintain if each OS had
> its
> > own page, Python/pip info had its own page, then bindings had their own
> > pages.
> >
> > Other suggestions?
> >
> > Cheers,
> > Aaron
> >
>


Re: build from source instructions

2018-08-28 Thread Lin Yuan
When a user chooses to build from source, it is reasonable to infer that
they want to run the make process and then install the Python package.
The current automated build script is confusing: I really have no idea what
to do if I want to change some of the MXNet source code. Furthermore,
building from source on macOS should follow the same (or a similar) process
as on Linux, since the two share the same shell environment. Having two
different sets of build instructions for macOS just adds confusion.
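For comparison, the Linux-style manual flow this message argues should also apply to macOS is roughly the following — a sketch, not the official instructions; flags and BLAS choice vary by platform:

```shell
# Rough build-from-source flow, essentially shared between Linux and macOS.
git clone --recursive https://github.com/apache/incubator-mxnet mxnet
cd mxnet
# nproc exists on Linux; sysctl is the macOS fallback.
make -j$(nproc 2>/dev/null || sysctl -n hw.ncpu) USE_OPENCV=1 USE_BLAS=openblas
cd python && pip install --user -e .   # install the Python binding in-place
```

The `-e` (editable) install is what makes the "change some source code, rebuild, rerun" loop work, which is exactly the use case the automated script obscures.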



On Tue, Aug 28, 2018 at 10:44 AM Bhavin Thaker 
wrote:

> The automated build script on macOS was written with the intention to have
> an automated, easy and quick way to build and install MXNet by any user,
> new-bie or advanced. The build script aims to provide repeatability and an
> easy way to test the build instructions.
>
> Without the script, the build instructions had many combinations of
> possibilities which would break for various users and there was no easy way
> to test all the combinations.
>
> I propose that we have both well-written build instructions with
> corresponding automated build script to ensure that the build instructions
> are well-tested.
>
> Please remember that there can be multiple use-cases and user preferences
> to build MXNet.
>
> Bhavin Thaker.
>
> On Tue, Aug 28, 2018 at 10:29 AM Afrooze, Sina  wrote:
>
> > +1 on fully automated scripts being more confusing than helpful. It's
> > difficult to debug any issues when the entire instruction is to run a
> > single script. - Sina
> >
> >
> >
> > On 8/28/18, 9:46 AM, "Lin Yuan"  wrote:
> >
> > Aaron,
> >
> > I agree the installation page is very confusing to me. When I first
> > tried
> > to build MXNet from source on MacOS, I was totally confused about the
> > instruction. Why was it vastly different from building from source on
> > Linux
> > given these two OS have similar shell commands. I feel the automatic
> > scripts on MacOS platform is rather confusing than simplifying.
> >
> > Lin
> >
> > On Mon, Aug 27, 2018 at 9:21 PM Steffen Rochel <
> > steffenroc...@gmail.com>
> > wrote:
> >
> > > Aaron - we should keep instructions how to build from source.
> > Updating and
> > > re-organizing makes sense to me.
> > > Steffen
> > >
> > > On Mon, Aug 27, 2018 at 4:54 PM Aaron Markham <
> > aaron.s.mark...@gmail.com>
> > > wrote:
> > >
> > > > Hello,
> > > > I was looking into the C++ instructions and came across this
> > seemingly
> > > > pretty old page:
> > > > https://mxnet.incubator.apache.org/install/build_from_source
> > > >
> > > > I think it has several inaccuracies as different/updated
> > installation
> > > info
> > > > has been added to different pages.
> > > >
> > > > Should it be deleted?
> > > >
> > > > Or should a specific build from source page be maintained
> > (moving/copying
> > > > info from the other more recently updated pages)?
> > > >
> > > > I'm really thinking that it would be easier to maintain if each
> OS
> > had
> > > its
> > > > own page, Python/pip info had its own page, then bindings had
> > their own
> > > > pages.
> > > >
> > > > Other suggestions?
> > > >
> > > > Cheers,
> > > > Aaron
> > > >
> > >
> >
> >
> >
> >
>


Propose to discontinue supporting Apache MXNet on Windows 7

2018-08-28 Thread Lin Yuan
Dear Community,



Currently, our MXNet installation guide for Windows does not work for
Windows 7; e.g., Microsoft Visual Studio 2015 is not supported on Windows 7.
In addition, Microsoft ended “Mainstream” support for Windows 7 in 2015 (
https://support.microsoft.com/en-us/help/13853/windows-lifecycle-fact-sheet).
Therefore, it is not possible for developers to build MXNet and verify
fixes on the Windows 7 platform. Given that there have been several issues
about MXNet errors on Windows 7 (issue #9271, issue #8921, issue #11163),
continuing to support Windows 7 would add even more burden on developers in
the future.



I therefore propose that we discontinue support for MXNet on Windows 7 in
the next release.


Specifically, this means the following required actions:

1) State the discontinuation of Windows 7 support in the release notes.

2) Update the MXNet webpage wherever a Windows version is mentioned.

3) Update the open GitHub issues related to Windows 7.


Please share your thoughts on this proposal and/or point out any action
items missing from the above.


Best Regards,


Lin
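The first action item could be complemented by a runtime warning in the Python package. A hypothetical helper — not part of MXNet, and the release-string parsing is an assumption — sketching how an unsupported Windows host could be flagged at import time:

```python
import platform

def is_unsupported_windows(system: str, release: str) -> bool:
    """Return True when the host is a Windows release older than 8
    (e.g. Windows 7), which this proposal would drop support for."""
    if system != "Windows":
        return False
    try:
        # platform.release() reports "7", "8", "10", ... on Windows.
        return float(release) < 8
    except ValueError:
        # Unrecognized release string ("Vista", "2008Server", ...): don't block.
        return False

def warn_if_unsupported() -> None:
    """Print a deprecation warning when running on Windows 7 or older."""
    if is_unsupported_windows(platform.system(), platform.release()):
        print("Warning: MXNet no longer supports Windows 7; "
              "please upgrade to Windows 10.")
```

Calling `warn_if_unsupported()` from the package's `__init__` would make the discontinuation visible to affected users without breaking other platforms.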


Re: MXNet developer setup on Mac with VSCode for develop, test and debug

2018-07-20 Thread Lin Yuan
Pedro, I have tried CLion briefly but found it could not resolve C++ tags
by default and threw many errors/warnings. Did you set up anything extra?
Besides, when debugging in a mixed-language (Python and C++) environment
in VSCode, you don't have to switch between IDEs.

On Thu, Jul 19, 2018 at 9:59 AM Pedro Larroy 
wrote:

> Have you guys tried CLion, works like a charm for me. (Requires license).
>
> On Wed, Jul 18, 2018 at 10:09 PM Naveen Swamy  wrote:
>
> > Thanks Sandeep for putting this together; it would make it easy for
> > people who prefer IDEs to get started with MXNet.
> >
> > On Wed, Jul 18, 2018 at 1:04 PM, Lin Yuan  wrote:
> >
> > > Hi Aaron,
> > >
> > > This doc is for development on Mac. It is not intended for Windows
> users.
> > > Maybe we can start a different thread to discuss about MXNet build on
> > > Windows? I have tried it myself on a GPU instances built on Windows
> DLAMI
> > > 10.0. I would love to share with you my setup steps.
> > >
> > > Lin
> > >
> > > On Wed, Jul 18, 2018 at 11:43 AM Markham, Aaron
> > > 
> > > wrote:
> > >
> > > > This is tangential, but Lin, I noticed during the RC1 tests you said
> > you
> > > > tried it out on Windows and it worked for you. I'd like to get VS2017
> > or
> > > VS
> > > > Code working, take Sandeep's setup content and possibly your Windows
> > > > experience, and improve the MXNet Windows setup guide. I've tried it
> > and
> > > > failed. Multiple times. I also tried the MKLDNN instructions and
> > failed.
> > > I
> > > > tried the setup tools batch file and was hit with a lot of dependency
> > > > errors. Some of the problem isn't in the MXNet docs, but in the
> > > > dependencies' documentation, but I'm left to go figure that out on my
> > > own.
> > > > Anyway, any help you can provide here would be great. Also, if any of
> > you
> > > > reading this has a sort of checklist or guide for Windows, I'd love
> to
> > > see
> > > > it.
> > > >
> > > > BTW, I'm using Windows 10 with an NVIDIA GeForce GTX 980, and was
> > trying
> > > > to use VS2017 Community Edition and MKL. I went to MKL after OpenBLAS
> > > > wasn't installing/building.
> > > >
> > > > On 7/18/18, 10:59 AM, "Lin Yuan"  wrote:
> > > >
> > > > Thanks for the well-written document! As a new MXNet developer, I
> > > have
> > > > found it very helpful.
> > > >
> > > > Lin
> > > >
> > > > On Wed, Jul 18, 2018 at 10:50 AM sandeep krishnamurthy <
> > > s...@apache.org
> > > > >
> > > > wrote:
> > > >
> > > > > Hello Community,
> > > > >
> > > > >
> > > > >
> > > > > As a MXNet contributor, I had issues and took me some time on
> > > getting
> > > > > hands-on with MXNet codebase, being able to code, test, DEBUG
> > > > python/CPP
> > > > > combination. I have documented the steps for MXNet development
> > > setup
> > > > using
> > > > > VSCode on Mac. Document starts from installing all required
> > > > > tools/packages/IDEs/extensions and then provides steps for
> > > debugging
> > > > mix of
> > > > > Python/CPP code, which is most likely the case for any MXNet
> > > > developer, all
> > > > > in single IDE window. By end of this document, anyone should be
> > > able
> > > > to
> > > > > walk through the MXNet code, debug and be able to make first
> code
> > > > change.
> > > > >
> > > > >
> > > > >
> > > > > Please feel free to add comments, make changes as necessary.
> > > > >
> > > > >
> > > > >
> > > > https://cwiki.apache.org/confluence/display/MXNET/
> > > MXNet+Developer+Setup+on+Mac
> > > > >
> > > > > Best,
> > > > > Sandeep
> > > > >
> > > >
> > > >
> > > >
> > >
> >
>


Re: [VOTE] Subscribe dev@ to Github Activities

2018-07-17 Thread Lin Yuan
+1. I think they are very relevant to dev@, and as Aaron said we can always
set up personalized filters.

On Tue, Jul 17, 2018 at 9:21 AM Aaron Markham 
wrote:

> +1, I don't read your emails anyways. Just kidding. I think it would be
> good to see the action, even if I eventually have to setup filters if it
> gets overwhelming.
>
> On Tue, Jul 17, 2018 at 9:15 AM, Tianqi Chen 
> wrote:
>
> > +1, most of issue and PR activities are about development, and they
> belong
> > to dev. It also helps us to recognizes contributors who are actively
> > contributing but less vocal via emails -- there are many of them.
> >
> > Tianqi
> >
> > On Tue, Jul 17, 2018 at 8:47 AM, Anirudh  wrote:
> >
> > > -1
> > >
> > > The low signal to noise ratio would mean that we may miss important
> > emails.
> > > Even with the different filters that we may setup for dev@, the emails
> > > would be too many to not miss the important ones. We would see more and
> > > more people starting a design discussion on an issue or PR. Because of
> > the
> > > low signal to noise ratio on the dev@ list, many may miss these
> > > discussions.
> > >
> > > Slowly, this would erode the purpose of the dev@ list as this means
> that
> > > you don't really have to do anything explicitly on the dev@ list.
> > > You can start a design discussion on a github issue. You can start a
> > > vote/discussion on a github issue.
> > >
> > > Anirudh
> > >
> > > On Mon, Jul 16, 2018 at 4:35 AM, Timur Shenkao 
> > wrote:
> > >
> > > > +1 if my vote can be taken into account
> > > >
> > > > On Mon, Jul 16, 2018 at 4:32 AM, Sheng Zha 
> wrote:
> > > >
> > > > > Hi,
> > > > >
> > > > > I'm starting a vote on subscribing dev@ to Github activities. See
> > > > previous
> > > > > discussion thread here
> > > > > <
> https://lists.apache.org/thread.html/3d883f6a3cbc8e81e810962e0c0fe7
> > > > > bfd01f0b78d3cb44034f566442@%3Cdev.mxnet.apache.org%3E>
> > > > > .
> > > > >
> > > > > The vote lasts for three days and ends on 7/18/2018 at 9pm pst.
> > > > >
> > > > > -sz
> > > > >
> > > >
> > >
> >
>


Re: MXNet developer setup on Mac with VSCode for develop, test and debug

2018-07-18 Thread Lin Yuan
Thanks for the well-written document! As a new MXNet developer, I have
found it very helpful.

Lin

On Wed, Jul 18, 2018 at 10:50 AM sandeep krishnamurthy 
wrote:

> Hello Community,
>
>
>
> As a MXNet contributor, I had issues and took me some time on getting
> hands-on with MXNet codebase, being able to code, test, DEBUG python/CPP
> combination. I have documented the steps for MXNet development setup using
> VSCode on Mac. Document starts from installing all required
> tools/packages/IDEs/extensions and then provides steps for debugging mix of
> Python/CPP code, which is most likely the case for any MXNet developer, all
> in single IDE window. By end of this document, anyone should be able to
> walk through the MXNet code, debug and be able to make first code change.
>
>
>
> Please feel free to add comments, make changes as necessary.
>
>
> https://cwiki.apache.org/confluence/display/MXNET/MXNet+Developer+Setup+on+Mac
>
> Best,
> Sandeep
>
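The mixed Python/C++ debugging described above typically relies on two entries in VSCode's launch.json: one that launches the Python script, and one that attaches a native debugger (lldb on Mac) to that same process. A minimal sketch, assuming the Python and CodeLLDB extensions are installed; the program path is illustrative:

```json
{
  "version": "0.2.0",
  "configurations": [
    {
      "name": "Python: run MXNet script",
      "type": "python",
      "request": "launch",
      "program": "${workspaceFolder}/example/train.py",
      "console": "integratedTerminal"
    },
    {
      "name": "LLDB: attach to Python",
      "type": "lldb",
      "request": "attach",
      "pid": "${command:pickProcess}"
    }
  ]
}
```

With both configurations, breakpoints in Python code and in the C++ backend can be hit from a single IDE window, which is the workflow the document walks through.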


Re: MXNet developer setup on Mac with VSCode for develop, test and debug

2018-07-18 Thread Lin Yuan
Hi Aaron,

This doc is for development on Mac; it is not intended for Windows users.
Maybe we can start a different thread to discuss MXNet builds on Windows?
I have tried it myself on a GPU instance built from the Windows DLAMI 10.0,
and I would love to share my setup steps with you.

Lin

On Wed, Jul 18, 2018 at 11:43 AM Markham, Aaron 
wrote:

> This is tangential, but Lin, I noticed during the RC1 tests you said you
> tried it out on Windows and it worked for you. I'd like to get VS2017 or VS
> Code working, take Sandeep's setup content and possibly your Windows
> experience, and improve the MXNet Windows setup guide. I've tried it and
> failed. Multiple times. I also tried the MKLDNN instructions and failed. I
> tried the setup tools batch file and was hit with a lot of dependency
> errors. Some of the problem isn't in the MXNet docs, but in the
> dependencies' documentation, but I'm left to go figure that out on my own.
> Anyway, any help you can provide here would be great. Also, if any of you
> reading this has a sort of checklist or guide for Windows, I'd love to see
> it.
>
> BTW, I'm using Windows 10 with an NVIDIA GeForce GTX 980, and was trying
> to use VS2017 Community Edition and MKL. I went to MKL after OpenBLAS
> wasn't installing/building.
>
> On 7/18/18, 10:59 AM, "Lin Yuan"  wrote:
>
> Thanks for the well-written document! As a new MXNet developer, I have
> found it very helpful.
>
> Lin
>
> On Wed, Jul 18, 2018 at 10:50 AM sandeep krishnamurthy  >
> wrote:
>
> > Hello Community,
> >
> >
> >
> > As a MXNet contributor, I had issues and took me some time on getting
> > hands-on with MXNet codebase, being able to code, test, DEBUG
> python/CPP
> > combination. I have documented the steps for MXNet development setup
> using
> > VSCode on Mac. Document starts from installing all required
> > tools/packages/IDEs/extensions and then provides steps for debugging
> mix of
> > Python/CPP code, which is most likely the case for any MXNet
> developer, all
> > in single IDE window. By end of this document, anyone should be able
> to
> > walk through the MXNet code, debug and be able to make first code
> change.
> >
> >
> >
> > Please feel free to add comments, make changes as necessary.
> >
> >
> >
> https://cwiki.apache.org/confluence/display/MXNET/MXNet+Developer+Setup+on+Mac
> >
> > Best,
> > Sandeep
> >
>
>
>


Re: [LAZY VOTE] Consolidating developer guide in one place (cwiki preferred)

2018-09-04 Thread Lin Yuan
+1

On Tue, Sep 4, 2018 at 1:46 PM Aaron Markham 
wrote:

> I'd like to call for a lazy vote on this before proceeding. Already had
> some +1s but let's be sure.
>
> The vote is to move developer guide info to cwiki. User guides would remain
> on the website.
>
> On Tue, Aug 21, 2018 at 12:53 PM sandeep krishnamurthy <
> sandeep.krishn...@gmail.com> wrote:
>
> > +1
> > Thanks Lin and Aaron. I agree website to cover all user facing
> > documentation and a separate consolidated and organized developer
> focussed
> > docs in one place (cwiki).
> >
> >
> > Note: Permissions on cwiki is currently not well managed with many people
> > having full admin rights to edit/create/delete pages. Should be fine for
> > now, but, when we start accumulating many documents and resources, we
> > should probably revisit on Delete permissions.
> >
> >
> > On Tue, Aug 21, 2018 at 11:57 AM Lin Yuan  wrote:
> >
> > > Hi Aaron,
> > >
> > > Thanks for your answer. I think it's a very worthwhile effort to move
> all
> > > the developer related content from mxet.io website to a dedicated
> > > developer
> > > site. Would you like to initiate this effort?
> > >
> > > Best,
> > >
> > > Lin
> > >
> > > On Wed, Aug 15, 2018 at 3:47 PM Haibin Lin 
> > > wrote:
> > >
> > > > +1
> > > >
> > > > On Wed, Aug 15, 2018 at 1:10 PM, Aaron Markham <
> > > aaron.s.mark...@gmail.com>
> > > > wrote:
> > > >
> > > > > Hi Lin, I agree with this organization. If you feel like somethings
> > > > should
> > > > > be transitioned from the website to the wiki, I can help with that,
> > but
> > > > for
> > > > > the moment I've been suggesting that new developer-focused content
> be
> > > > > placed on the wiki.
> > > > >
> > > > > On Tue, Aug 14, 2018 at 10:40 AM, Lin Yuan 
> > > wrote:
> > > > >
> > > > > > Dear MXNet community,
> > > > > >
> > > > > > As a developer, I noticed we have some developer guide scattered
> in
> > > > > > different websites (mxnet.io, cwiki):
> > > > > >
> > > > > > E.g.
> > > > > >
> > > > > > How to Create New Operators (Layers): [
> > > > > > https://mxnet.incubator.apache.org/faq/new_op.html]
> > > > > > A Guide to Implementing Sparse Operators in MXNet Backend [
> > > > > > https://cwiki.apache.org/confluence/display/MXNET/A+
> > > > > > Guide+to+Implementing+Sparse+Operators+in+MXNet+Backend
> > > > > > ]
> > > > > >
> > > > > > When searching developer guide by keyword, only one of them can
> be
> > > > > returned
> > > > > > on either site.
> > > > > >
> > > > > > It will be more convenient for developers if all the developer
> > guide
> > > > > > resides on cwiki and all user guide (non-developer) on the
> > mxnet.io
> > > > > > website. We can add a link on mxnet.io to refer all developers
> to
> > > > cwiki
> > > > > > for
> > > > > > guidance.
> > > > > >
> > > > > > Any comment is appreciated.
> > > > > >
> > > > > > Best Regards,
> > > > > >
> > > > > > Lin
> > > > > >
> > > > >
> > > >
> > >
> >
> >
> > --
> > Sandeep Krishnamurthy
> >
>


Re: [DISCUSS] Build OSX builds in CI (possibly with TravisCI).

2018-09-05 Thread Lin Yuan
Hi Kellen,

Many thanks for your and Marco's efforts! I think this is a crucial piece
for improving MXNet's stability.

To add some data points:
1) Customers using the CoreML-to-MXNet converter were blocked for a while
because the converter was broken and no unit test was in place to detect
that.
2) Developers on Mac cannot verify their local commits because some unit
tests on master were broken. This wasted much time and many resources on
the Jenkins server to detect the failure.
3) Please consider running the CI on macOS 10.13, since this is the minimum
macOS version that supports CoreML (needed for the CoreML-to-MXNet
converter).

Best Regards,

Lin

On Wed, Sep 5, 2018, 3:02 AM kellen sunderland 
wrote:

> I'm bumping this thread as we've recently had our first serious bug on
> MacOS that would have been caught by enabling Travis.
>
> I'm going to do a little experimental work together with Marco with the
> goal of enabling a minimal Travis build that will run python tests.  So far
> I've verified that Travis will in fact find a bug that currently exists in
> master and has been reproduced by MacOS clients.  This indicates to me that
> adding Travis will add value to our CI.
>
> My best guess is that it might take us some iteration before we find a
> scalable way to integrate Travis.  Given this we're going to enable Travis
> in non-blocking mode (i.e. failures are safe to ignore for the time being).
>
> To help mitigate the risk of timeouts, and to remove legacy code I'm going
> to re-create the travis.yml file from scratch.  I think it'll be much less
> confusing if we only have working code related to Travis in our codebase,
> so that contributors won't have to experiment to see what is or isn't
> working.  We've got some great, but slightly out-of-date functionality in
> the legacy .travis.yml file.  I hope we can work together to update the
> legacy features, ensure they work with the current folder structure and
> also make sure the features run within Travis's 45 minute global time
> window.
>
> I'd also like to set expectations that this is strictly a volunteer
> effort.  I'd welcome help from the community for support and maintenance.
> The model downloading caching work particularly stands out to me as
> something I'd like to re-enable again as soon as possible.
>
> -Kellen
>
> On Tue, Jan 9, 2018 at 11:52 AM Marco de Abreu <
> marco.g.ab...@googlemail.com>
> wrote:
>
> > Looks good! +1
> >
> > On Tue, Jan 9, 2018 at 10:24 AM, kellen sunderland <
> > kellen.sunderl...@gmail.com> wrote:
> >
> > > I think most were in favour of at a minimum creating a clang build so
> > I've
> > > created a PR
> > > https://github.com/apache/incubator-mxnet/pull/9330/commits/
> > > 84089ea14123ebe4d66cc92e82a2d529cfbd8b19.
> > > My hope is this will catch many of the issues blocking OSX builds.  In
> > fact
> > > it already caught one issue.  If you guys are in favour I can remove
> the
> > > WIP and ask that it be merged.
> > >
> > > On Thu, Jan 4, 2018 at 6:29 PM, Chris Olivier 
> > > wrote:
> > >
> > > > Nope, I have been on vacation.
> > > >
> > > > On Thu, Jan 4, 2018 at 9:10 AM, kellen sunderland <
> > > > kellen.sunderl...@gmail.com> wrote:
> > > >
> > > > > Hope everyone had a good break.  Just wanted to check if there were
> > > > further
> > > > > thoughts on OSX builds.  Chris, did you have time to look into
> > > > virtualizing
> > > > > Mac OS?  Would it make sense for us to put something in place in
> the
> > > > > interim e.g. the clang solution?
> > > > >
> > > > > On Tue, Dec 12, 2017 at 7:59 PM, de Abreu, Marco <
> mab...@amazon.com>
> > > > > wrote:
> > > > >
> > > > > > Thanks for looking into this, Chris! No hurries on that one,
> we’ll
> > > look
> > > > > > into it next stage when we add new system- and
> build-configurations
> > > to
> > > > > the
> > > > > > CI.
> > > > > >
> > > > > > On 12.12.17, 19:12, "Chris Olivier" 
> wrote:
> > > > > >
> > > > > > I am on vacation starting Thursday.
> > > > > >
> > > > > > On Tue, Dec 12, 2017 at 9:49 AM kellen sunderland <
> > > > > > kellen.sunderl...@gmail.com> wrote:
> > > > > >
> > > > > > > Absolutely, let's do an investigation and see if it's
> > possible
> > > to
> > > > > > > virtualize.  Would you have time to look into it a bit
> > further?
> > > > > > >
> > > > > > > On Tue, Dec 12, 2017 at 6:47 PM, Chris Olivier <
> > > > > > cjolivie...@gmail.com>
> > > > > > > wrote:
> > > > > > >
> > > > > > > > Don’t get me wrong, I’m not saying this Mac OS Jenkins
> > > solution
> > > > > is
> > > > > > doable
> > > > > > > > but I feel like we should investigate because the payoff
> > > would
> > > > be
> > > > > > large.
> > > > > > > >
> > > > > > > >
> > > > > > > > On Tue, Dec 12, 2017 at 9:38 AM Chris Olivier <
> > > > > > cjolivie...@gmail.com>
> > > > > > > > wrote:
> > > > > > > >
> > > > > > > > > Apple’s Darwin OS Is recently open-sourced.
> > > > > > > > > 

Re: [DISCUSS] Build OSX builds in CI (possibly with TravisCI).

2018-09-05 Thread Lin Yuan
Hi Kellen,

I would love to contribute. Please let me know if you have any particular
work item I can help with.

Best,

Lin

On Wed, Sep 5, 2018 at 9:51 AM Tianqi Chen  wrote:

> is it possible for us to get a MacBook and hook it to the current Jenkins
> CI? Travis OSX usually build from scratch and that was pretty slow
>
> Tianqi
>
>
> On Wed, Sep 5, 2018 at 8:49 AM kellen sunderland <
> kellen.sunderl...@gmail.com> wrote:
>
> > Great you feel that way Lin, please feel free to contribute if you have
> any
> > features you'd like tested.  We are using the travis image xcode9.4 which
> > is based on MacOS 10.13.
> >
> > On Wed, Sep 5, 2018 at 6:40 PM Lin Yuan  wrote:
> >
> > > Hi Kellen,
> > >
> > > Many thanks for your and Marco's effort! I think this is a very crucial
> > > piece to improve MXNet stability.
> > >
> > > To add some data points:
> > > 1) Customers using CoreML to MXNet converter were blocked for a while
> > > because the converter was broken and no unit test was in place to
> detect
> > > that.
> > > 2) Developers on Mac cannot verify their local commits because some
> unit
> > > tests on master were broken. This wasted much time and resource on
> > jenkins
> > > server to detect the failure.
> > > 3) Please consider running the CI on Mac OS 10.13 since this is the
> > minimum
> > > Mac OS version that supports CoreML (to support CoreML to MXNet
> > converter)
> > >
> > > Best Regards,
> > >
> > > Lin
> > >
> > > On Wed, Sep 5, 2018, 3:02 AM kellen sunderland <
> > > kellen.sunderl...@gmail.com>
> > > wrote:
> > >
> > > > I'm bumping this thread as we've recently had our first serious bug
> on
> > > > MacOS that would have been caught by enabling Travis.
> > > >
> > > > I'm going to do a little experimental work together with Marco with
> the
> > > > goal of enabling a minimal Travis build that will run python tests.
> So
> > > far
> > > > I've verified that Travis will in fact find a bug that currently
> exists
> > > in
> > > > master and has been reproduced by MacOS clients.  This indicates to
> me
> > > that
> > > > adding Travis will add value to our CI.
> > > >
> > > > My best guess is that it might take us some iteration before we find
> a
> > > > scalable way to integrate Travis.  Given this we're going to enable
> > > Travis
> > > > in non-blocking mode (i.e. failures are safe to ignore for the time
> > > being).
> > > >
> > > > To help mitigate the risk of timeouts, and to remove legacy code I'm
> > > going
> > > > to re-create the travis.yml file from scratch.  I think it'll be much
> > > less
> > > > confusing if we only have working code related to Travis in our
> > codebase,
> > > > so that contributors won't have to experiment to see what is or isn't
> > > > working.  We've got some great, but slightly out-of-date
> functionality
> > in
> > > > the legacy .travis.yml file.  I hope we can work together to update
> the
> > > > legacy features, ensure they work with the current folder structure
> and
> > > > also make sure the features run within Travis's 45 minute global time
> > > > window.
> > > >
> > > > I'd also like to set expectations that this is strictly a volunteer
> > > > effort.  I'd welcome help from the community for support and
> > maintenance.
> > > > The model downloading caching work particularly stands out to me as
> > > > something I'd like to re-enable again as soon as possible.
> > > >
> > > > -Kellen
> > > >
> > > > On Tue, Jan 9, 2018 at 11:52 AM Marco de Abreu <
> > > > marco.g.ab...@googlemail.com>
> > > > wrote:
> > > >
> > > > > Looks good! +1
> > > > >
> > > > > On Tue, Jan 9, 2018 at 10:24 AM, kellen sunderland <
> > > > > kellen.sunderl...@gmail.com> wrote:
> > > > >
> > > > > > I think most were in favour of at a minimum creating a clang
> build
> > so
> > > > > I've
> > > > > > created a PR
> > > > > > https://github.com/apache/incubator-mxnet/pull/9330/commits/
> > > > > > 84089ea14123ebe4d66cc92e82a2d529cfbd8b19.
> > > > > > My hope i

Re: Propose to discontinue supporting Apache MXNet on Windows 7

2018-08-30 Thread Lin Yuan
Hi Hagay,

To minimize the impact on existing customers, I suggest we support MXNet 1.3
installation on Windows 7 but defer all bug fixes to the Windows 10 version and
ask users to migrate. Then we officially discontinue support for Windows
7 in the 1.4 release.

Thanks,

Lin

On Wed, Aug 29, 2018 at 1:29 PM Hagay Lupesko  wrote:

> +1 (non-binding)
> Thanks for raising this Lin!
> Are you suggesting to do it as part of MXNet 1.3?
>
> On Wed, Aug 29, 2018 at 9:14 AM Srivastava, Rohit Kumar <
> srivastava@buckeyemail.osu.edu> wrote:
>
> > +1
> >
> > On 8/29/18, 8:39 AM, "sandeep krishnamurthy" <
> sandeep.krishn...@gmail.com>
> > wrote:
> >
> > +1 Thanks for bringing this up.
> >
> > On Wed, Aug 29, 2018 at 6:38 AM Marco de Abreu
> >  wrote:
> >
> > > +1
> > >
> > > On Wed, Aug 29, 2018 at 1:08 PM kellen sunderland <
> > > kellen.sunderl...@gmail.com> wrote:
> > >
> > > > +1 (non-binding)
> > > >
> > > > On Wed, Aug 29, 2018, 1:18 AM Anirudh Acharya <
> > anirudhk...@gmail.com>
> > > > wrote:
> > > >
> > > > > +1 for discontinuing.
> > > > >
> > > > > On Tue, Aug 28, 2018 at 4:11 PM Naveen Swamy <
> mnnav...@gmail.com
> > >
> > > wrote:
> > > > >
> > > > > > +1 to stop supporting Win7
> > > > > >
> > > > > > On Tue, Aug 28, 2018 at 3:54 PM Lin Yuan <
> apefor...@gmail.com>
> > > wrote:
> > > > > >
> > > > > > > Dear Community,
> > > > > > >
> > > > > > >
> > > > > > >
> > > > > > > Currently, our MXNet installation guide for Windows does
> not
> > work
> > > for
> > > > > > > Windows 7. e.g. Microsoft Visual Studio 2015 is not
> > supported on
> > > > > Windows
> > > > > > 7
> > > > > > > <
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
> https://visualstudio.microsoft.com/vs/support/vs2015/received-error-specified-program-requires-newer-version-windows/
> > > > > > > >.
> > > > > > > In addition, MSFT ended “Mainstream” support for Windows 7
> > in 2015
> > > (
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
> https://support.microsoft.com/en-us/help/13853/windows-lifecycle-fact-sheet
> > > > > > > ).
> > > > > > > Therefore, it is not possible for developers to build MXNet
> > and
> > > > verify
> > > > > > the
> > > > > > > fix on Windows 7 platform. Given that there have been
> several
> > > issues
> > > > > > about
> > > > > > > MXNet error on Windows 7 (issue#9271
> > > > > > > <https://github.com/apache/incubator-mxnet/issues/9271>,
> > issue
> > > #8921
> > > > > > > <https://github.com/apache/incubator-mxnet/issues/8921>,
> > issue
> > > > #11163
> > > > > > > <https://github.com/apache/incubator-mxnet/issues/11163>),
> > it will
> > > > > even
> > > > > > > add
> > > > > > > more burden on developers in the future if we were to
> > continue
> > > > > supporting
> > > > > > > Windows 7.
> > > > > > >
> > > > > > >
> > > > > > >
> > > > > > > I therefore would like to propose that we discontinue the
> > support
> > > of
> > > > > > MXNet
> > > > > > > on Windows 7 in the next release.
> > > > > > >
> > > > > > >
> > > > > > > Specifically, this means the following required actions:
> > > > > > >
> > > > > > > 1) state the discontinuation of Windows 7 support in the
> > release
> > > note
> > > > > > >
> > > > > > > 2) update the MXNet webpage if Windows version is
> mentioned.
> > > > > > >
> > > > > > > 3) update the open Github issues related to Windows 7
> > > > > > >
> > > > > > >
> > > > > > > Please share your thoughts about this proposal and/or
> > suggest if
> > > > there
> > > > > is
> > > > > > > any other missing action item from the above.
> > > > > > >
> > > > > > >
> > > > > > > Best Regards,
> > > > > > >
> > > > > > >
> > > > > > > Lin
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
> >
> > --
> > Sandeep Krishnamurthy
> >
> >
> >
>


Re: Propose to discontinue supporting Apache MXNet on Windows 7

2018-08-30 Thread Lin Yuan
Hi Sheng,

Thanks for raising this concern. The problem now is that we cannot even
build MXNet on Windows 7, because the build process requires MS VS 2015 w/
update 3, which is incompatible with Windows 7. This leaves many Windows 7
related open issues on GitHub without any timely response. In my opinion,
having no response to users' requests is probably even worse than letting
them know the limitations of OS support.

To minimize the impact on current Windows 7 users, we can provide a PyPI
package for Windows 7 in this release but defer bug fixes and feature
enhancements to later Windows OS versions. Based on users' feedback, we can
then officially discontinue Windows 7 support in the next MXNet
release.

I would appreciate your comments.

Lin



On Wed, Aug 29, 2018 at 1:37 PM Sheng Zha  wrote:

> Are any of the votes based on any measure of user impact, if we indeed
> decide not to fix the current problems?
>
> -sz
>
> On Wed, Aug 29, 2018 at 1:29 PM Hagay Lupesko  wrote:
>
> > +1 (non-binding)
> > Thanks for raising this Lin!
> > Are you suggesting to do it as part of MXNet 1.3?
> >
> > On Wed, Aug 29, 2018 at 9:14 AM Srivastava, Rohit Kumar <
> > srivastava@buckeyemail.osu.edu> wrote:
> >
> > > +1
> > >
> > > On 8/29/18, 8:39 AM, "sandeep krishnamurthy" <
> > sandeep.krishn...@gmail.com>
> > > wrote:
> > >
> > > +1 Thanks for bringing this up.
> > >
> > > On Wed, Aug 29, 2018 at 6:38 AM Marco de Abreu
> > >  wrote:
> > >
> > > > +1
> > > >
> > > > On Wed, Aug 29, 2018 at 1:08 PM kellen sunderland <
> > > > kellen.sunderl...@gmail.com> wrote:
> > > >
> > > > > +1 (non-binding)
> > > > >
> > > > > On Wed, Aug 29, 2018, 1:18 AM Anirudh Acharya <
> > > anirudhk...@gmail.com>
> > > > > wrote:
> > > > >
> > > > > > +1 for discontinuing.
> > > > > >
> > > > > > On Tue, Aug 28, 2018 at 4:11 PM Naveen Swamy <
> > mnnav...@gmail.com
> > > >
> > > > wrote:
> > > > > >
> > > > > > > +1 to stop supporting Win7
> > > > > > >
> > > > > > > On Tue, Aug 28, 2018 at 3:54 PM Lin Yuan <
> > apefor...@gmail.com>
> > > > wrote:
> > > > > > >
> > > > > > > > Dear Community,
> > > > > > > >
> > > > > > > >
> > > > > > > >
> > > > > > > > Currently, our MXNet installation guide for Windows does
> > not
> > > work
> > > > for
> > > > > > > > Windows 7. e.g. Microsoft Visual Studio 2015 is not
> > > supported on
> > > > > > Windows
> > > > > > > 7
> > > > > > > > <
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
> https://visualstudio.microsoft.com/vs/support/vs2015/received-error-specified-program-requires-newer-version-windows/
> > > > > > > > >.
> > > > > > > > In addition, MSFT ended “Mainstream” support for Windows
> 7
> > > in 2015
> > > > (
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
> https://support.microsoft.com/en-us/help/13853/windows-lifecycle-fact-sheet
> > > > > > > > ).
> > > > > > > > Therefore, it is not possible for developers to build
> MXNet
> > > and
> > > > > verify
> > > > > > > the
> > > > > > > > fix on Windows 7 platform. Given that there have been
> > several
> > > > issues
> > > > > > > about
> > > > > > > > MXNet error on Windows 7 (issue#9271
> > > > > > > > <https://github.com/apache/incubator-mxnet/issues/9271>,
> > > issue
> > > > #8921
> > > > > > > > <https://github.com/apache/incubator-mxnet/issues/8921>,
> > > issue
> > &

Re: [VOTE] Release MXNet version 1.2.1.RC1

2018-07-12 Thread Lin Yuan
+1 Built on Windows server. Ran unittests. Works as expected.

On Thu, Jul 12, 2018 at 8:01 AM sandeep krishnamurthy <
sandeep.krishn...@gmail.com> wrote:

> +1
>
> Built from source. Tested on CPU and GPU with Keras-MXNet (ResNet and LSTM
> examples), works as expected.
>
> Best,
> Sandeep
>
> On Thu, Jul 12, 2018 at 7:10 AM Indhu  wrote:
>
> > +1
> >
> > Built from source. Tested few RNN samples on GPU. Works fine.
> >
> > On Thu, Jul 12, 2018, 12:01 AM Haibin Lin 
> > wrote:
> >
> > > +1
> > > Built from source with cuda and dist kvstore. Ran dist_sync_kvstore.py
> > > nightly test and it passed.
> > >
> > > Best,
> > > Haibin
> > >
> > > On Wed, Jul 11, 2018 at 6:13 PM, Roshani Nagmote <
> > > roshaninagmo...@gmail.com>
> > > wrote:
> > >
> > > > Hi All,
> > > >
> > > > Could you please test and vote for this release? Voting will end
> > tomorrow
> > > > by 5:50 pm PDT.
> > > >
> > > > Thanks,
> > > > Roshani
> > > >
> > > > On Mon, Jul 9, 2018 at 4:53 PM Roshani Nagmote <
> > > roshaninagmo...@gmail.com>
> > > > wrote:
> > > >
> > > > > Hi all,
> > > > >
> > > > > I would like to propose a vote to release Apache MXNet (incubating)
> > > > > version
> > > > > 1.2.1.RC1. Voting will start now (Monday, Jul 9th) and end at 5:50
> PM
> > > > > PDT, Thursday, July 12th.
> > > > >
> > > > > Link to release candidate 1.2.1.rc1:
> > > > > *https://github.com/apache/incubator-mxnet/releases/tag/1.2.1.rc1
> > > > >  >*
> > > > >
> > > > > View this page, click on "Build from Source", and use the source
> code
> > > > > obtained from 1.2.1.rc1 tag:
> > > > > https://mxnet.incubator.apache.org/install/index.html
> > > > >
> > > > > (Note: The README.md points to the 1.2.1 tag and does not work at
> the
> > > > > moment.)
> > > > >
> > > > > Please remember to test first before voting accordingly:
> > > > >
> > > > > +1 = approve
> > > > > +0 = no opinion
> > > > > -1 = disapprove (provide reason)
> > > > >
> > > > > Thanks,
> > > > > Roshani
> > > > >
> > > >
> > >
> >
>
>
> --
> Sandeep Krishnamurthy
>


Re: [DISCUSS] Subscribe dev@ to Github Activities?

2018-07-12 Thread Lin Yuan
+1

On Thu, Jul 12, 2018 at 12:26 PM Anirudh Acharya 
wrote:

> +1
>
> On Thu, Jul 12, 2018 at 11:51 AM Piyush Ghai 
> wrote:
>
> > +1
> > > On Jul 12, 2018, at 11:50 AM, Tianqi Chen 
> > wrote:
> > >
> > > +1
> > >
> > > On Thu, Jul 12, 2018 at 11:10 AM, Sheng Zha 
> wrote:
> > >
> > >> Hi all,
> > >>
> > >> Should we subscribe dev list to github updates on mxnet repo? Both
> > github
> > >> issues/PRs and the dev list are intended for technical discussions and
> > in
> > >> that aspect largely share the same goal. Since MXNet has most activity on
> > >> github, this could help dev@ to become more active. Some pros and
> cons:
> > >>
> > >> Pros:
> > >> - There have been many high quality discussions that happen on github
> to
> > >> which the dev list can benefit.
> > >> - Replies on update emails are reflected on the specific issue/PR.
> > >> - Users can also choose to click on the link and go to github to
> > >> participate in discussion.
> > >> - We still have the ability to carry out dev@ only conversation.
> > >>
> > >> Cons:
> > >> - Higher volume on dev list.
> > >> - Some discussions might not be suitable for dev@. (though I can't
> > think
> > >> of
> > >> why such conversation should happen on github either)
> > >>
> > >> -sz
> > >>
> >
> >
>


Re: Horovod-MXNet Integration

2018-11-02 Thread Lin Yuan
Hi Mu,

Darren (@yuxihu) and I have been working on
releasing MXNet-Horovod integration in production. We have made some
changes on both MXNet and Horovod sides. The changes on MXNet side have
mostly been merged and we are working to merge code to horovod repo. We
will send a design doc to you for review again next week.

Thanks for your feedback,

Lin

On Wed, Oct 31, 2018 at 12:03 PM Mu Li  wrote:

> Thanks for your contribution, Carl.
>
> I remember I left a comment on the proposal, but today I found it had
> disappeared. My suggestion is to try our best not to change the existing API.
> The reason is that we would need to change all trainers on the frontend that
> use the existing kvstore APIs, which may cause confusion to users.
>
> The current proposal wants to add the following 4 APIs into kvstore:
>
>
>    - kv.pushpull
>    - kv.broadcast
>    - kv.local_rank
>    - kv.num_local_workers
>
>
> Pushpull can be done with a sequential push and pull: you can do nothing in
> push and put all the workload into pushpull. Broadcast can be implemented by
> pull.
>
> What are local workers? GPUs in a single machine? If so, we can query that
> directly.
>
>
> On Fri, Sep 14, 2018 at 4:46 PM Carl Yang  wrote:
>
> > Hi,
> >
> > Currently, MXNet distributed can only be done using parameter server.
> > Horovod is an open-source distributed training framework that has
> > shown 2x speedup compared to TensorFlow using Parameter Server. We
> > propose to add Horovod support to MXNet. This will help our users
> > achieve goal of linear scalability to 256 GPUs and beyond. Design
> > proposal on cwiki:
> >
> >
> https://cwiki.apache.org/confluence/display/MXNET/Horovod-MXNet+Integration
> >
> > Please feel free to let me know if you have any suggestions or feedback.
> >
> > Regards,
> > Carl
> >
>
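Mu's point above — that pushpull and broadcast can be layered on the existing push/pull primitives — can be sketched with a toy, dict-backed key-value store. Everything here (class name, summing aggregation) is an illustration only, not MXNet's actual KVStore implementation:

```python
# Toy key-value store sketch: pushpull as a push followed by a pull, and
# broadcast as a plain pull of an initialized key, mirroring Mu's suggestion.
# This is illustrative only, not MXNet's real KVStore.

class ToyKVStore:
    def __init__(self):
        self._store = {}

    def push(self, key, value):
        # Aggregate pushed values by summing, as a parameter server would.
        self._store[key] = self._store.get(key, 0) + value

    def pull(self, key):
        return self._store[key]

    def pushpull(self, key, value):
        # "Do nothing in push and put all workloads into pushpull":
        # here we simply compose the two existing primitives.
        self.push(key, value)
        return self.pull(key)

    def broadcast(self, key, init_value):
        # Broadcast implemented via pull: initialize once, then pull.
        self._store.setdefault(key, init_value)
        return self.pull(key)

kv = ToyKVStore()
kv.broadcast("w", 1.0)        # every worker pulls the initial weight
print(kv.pushpull("w", 0.5))  # push a gradient, pull the aggregate -> 1.5
```

The design question in the thread is exactly whether such composition should live behind new kvstore methods or be left to the frontend trainers.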


Re: Coverity scan

2018-11-02 Thread Lin Yuan
Anton,

Yes, I did a scan using Coverity on MXNet a few months ago. It did show
some memory issues. I was later buried by other work with higher priority
and would definitely like to see Coverity (or any other better memory-scan
tool) run regularly on the MXNet backend.

Let me know if you want to discuss this topic further. I would like to
provide as much help as I can.

Best,

Lin

On Fri, Nov 2, 2018 at 3:15 PM kellen sunderland <
kellen.sunderl...@gmail.com> wrote:

> Totally agree Pedro, reporting the data in a more accessible way would be a
> huge improvement.  For this reason alone I think it might be worthwhile
> adopting coverity.
>
> On Fri, Nov 2, 2018, 11:38 AM Pedro Larroy  wrote:
>
> > Thanks a lot, I think is very beneficial that we invest in these kind of
> > tooling for code quality. As a developer I wonder, do we have actionable
> > items for looking at / fixing these issues or right now is done in an
> > informational / good will basis?
> >
> > Is there a way to colorize this output?
> >
> > Pedro.
> >
> > On Fri, Nov 2, 2018 at 5:10 PM kellen sunderland <
> > kellen.sunderl...@gmail.com> wrote:
> >
> > > Reference scan here (I believe I also count 5 memory violations):
> > >
> > >
> >
> http://jenkins.mxnet-ci.amazon-ml.com/blue/rest/organizations/jenkins/pipelines/incubator-mxnet/branches/master/runs/1856/nodes/104/log/?start=0
> > >
> > > -Kellen
> > >
> > > On Fri, Nov 2, 2018 at 9:07 AM kellen sunderland <
> > > kellen.sunderl...@gmail.com> wrote:
> > >
> > > > Hey Anton, can you provide a sample scan?  I'm interested to see if
> it
> > > > catches different memory access violations, or if it gets the same
> ones
> > > > we've already seen reported by clang-tidy.  For example are these
> > > > violations in the reports:
> > > > --
> > > >
> "/work/mxnet/3rdparty/dmlc-core/include/dmlc/concurrentqueue.h:3443:24:
> > > > warning: Access to field 'capacity' results in a dereference of a
> null
> > > > pointer (loaded from variable 'mainHash')
> > > > [clang-analyzer-core.NullDereference]"
> > > >
> > > > ---
> > > >
> > > > /work/mxnet/3rdparty/mshadow/mshadow/./tensor.h:64:23: warning:
> > Assigned
> > > value is garbage or undefined
> [clang-analyzer-core.uninitialized.Assign]
> > > >   this->shape_[i] = s[i];"
> > > >
> > > > -
> > > >
> > > >
> > > >
> > >
> >
> /usr/bin/../lib/gcc/x86_64-linux-gnu/8.0.1/../../../../include/c++/8.0.1/ext/atomicity.h:67:29:
> > > warning: Use of memory after it is freed
> > > [clang-analyzer-cplusplus.NewDelete]
> > > >
> > > > --
> > > >
> > > > -Kellen
> > > >
> > > >
> > > >
> > > > On Fri, Nov 2, 2018 at 2:20 AM Anton Chernov 
> > > wrote:
> > > >
> > > >> Dear MXNet community,
> > > >>
> > > >> I had investigated the possibility to adopt Coverity static analysis
> > > tools
> > > >> for the MXNet project and it turned out that there is a tool
> provided
> > by
> > > >> Synopsys for open-source projects:
> > > >>
> > > >> https://scan.coverity.com
> > > >>
> > > >> The tool works nicely with GitHub [1] and I found that a scan for a
> > fork
> > > >> (from @apeforest) [2] was already set up. I can not tell how long
> ago
> > > the
> > > >> scan was performed, but at the time of writing the project page
> shows
> > 5
> > > >> illegal memory access errors, that I think would be worth
> > investigating.
> > > >>
> > > >> If there is interest I would suggest that we would setup a Coverity
> > scan
> > > >> for the main repository instead of a fork and people that have
> > interest
> > > >> managing and fixing issues would request add them to the project.
> > > >>
> > > >> I would appreciate feedback for this proposal and help from people
> > > having
> > > >> rights for the main repository to set things up.
> > > >>
> > > >> Best regards,
> > > >> Anton
> > > >>
> > > >> [1] https://scan.coverity.com/github
> > > >> [2] https://scan.coverity.com/projects/apeforest-incubator-mxnet
> > > >>
> > > >
> > >
> >
>
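Pedro's ask for more actionable, readable reporting could start with something as simple as tallying check names out of a clang-tidy/clang-analyzer log like the one linked above. A minimal sketch (the sample lines are the ones quoted in this thread; paths are truncated for brevity):

```python
# Sketch: summarize clang-tidy/clang-analyzer output by check name so the
# most frequent classes of issue stand out. The regex targets lines of the
# form "...: warning: <message> [check-name]".
import re
from collections import Counter

CHECK_RE = re.compile(r'warning: .* \[([\w.\-]+)\]')

def tally_checks(log_lines):
    counts = Counter()
    for line in log_lines:
        m = CHECK_RE.search(line)
        if m:
            counts[m.group(1)] += 1
    return counts

log = [
    "concurrentqueue.h:3443:24: warning: Access to field 'capacity' results "
    "in a dereference of a null pointer [clang-analyzer-core.NullDereference]",
    "tensor.h:64:23: warning: Assigned value is garbage or undefined "
    "[clang-analyzer-core.uninitialized.Assign]",
    "atomicity.h:67:29: warning: Use of memory after it is freed "
    "[clang-analyzer-cplusplus.NewDelete]",
]
for check, n in tally_checks(log).most_common():
    print(f"{n:4d}  {check}")
```

Colorizing would then just be a matter of wrapping the output lines in ANSI escape codes keyed on the check family.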


Re: [Discussion] Recognise Reviewers, Besides Committers and PMC

2018-10-20 Thread Lin Yuan
+1 sounds like a great idea. We also need a mechanism to identify "good
reviewers". Maybe we can count the number of :thumbsup: reactions each
review receives. Or is there any better way?

On Fri, Oct 19, 2018 at 8:22 PM Tianqi Chen 
wrote:

> Dear MXNet Community:
>
> There is a great discussion going on in terms of lowering the barrier of
> entries and encourage more contribution to the project.  One of the general
> goals is to encourage a broader pool of contributions. I want to make the
> following proposal:
>
> Besides Committers and PMC, let us also recognize Reviewers in the
> community.  This is a "pseudo role" as there is no such official role in
> Apache. But I want to explore the possibility of recognising active
> reviewers for example, by adding a list of names in the contributor list.
> In general, I find it is really helpful to have more code reviews.
> Recognising good reviewers early enables us to find candidate for
> committers, and encourage them to contribute and understand what is the bar
> of code quality that is required to merge the code.
>
> This can provide the community more evidence when recruiting new
> committers. After all committers is about write access to the code and
> understand the consequence of the responsibility -- which is usually can be
> found in high quality reviews.
>
> Please let me know what you think.
> Tianqi
>
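Lin's thumbs-up idea could be prototyped offline before touching any infrastructure. A minimal sketch, assuming review data has already been fetched (the input shape below is hypothetical; in practice it would come from the GitHub REST API's reactions endpoints):

```python
# Sketch: rank reviewers by the +1 (:thumbsup:) reactions their reviews
# received. The dict shape is an assumption for illustration; real data
# would be fetched from GitHub's reactions API.
from collections import Counter

def rank_reviewers(reviews):
    """reviews: iterable of dicts like
    {"reviewer": "alice", "reactions": {"+1": 3}}"""
    tally = Counter()
    for review in reviews:
        tally[review["reviewer"]] += review.get("reactions", {}).get("+1", 0)
    return tally.most_common()

sample = [
    {"reviewer": "alice", "reactions": {"+1": 3}},
    {"reviewer": "bob", "reactions": {"+1": 1}},
    {"reviewer": "alice", "reactions": {"+1": 2}},
]
print(rank_reviewers(sample))  # [('alice', 5), ('bob', 1)]
```

Raw reaction counts are of course a crude proxy for review quality, which is part of what the thread is debating.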


[DISCUSS] Speedup non-code PR in CI

2018-11-06 Thread Lin Yuan
Dear Community,

I recently submitted a few small PRs with only changes in README files.
However, I noticed they still triggered the full cycle of CI including
build and test on all platforms.

Do we have a plan to speed up this process, maybe by skipping CI for
non-code PRs? Sorry if this topic has been raised earlier; if not, I'd
appreciate any comments.

Cheers,

Lin


Catch divide-by-zero floating number exception in backend

2018-11-08 Thread Lin Yuan
Dear MXNet Community,

I recently found that NaN errors can sometimes be caused by
divide-by-zero floating-point bugs in the engine backend. However, by
default, such an exception is not thrown. I added a signal trap to catch
this error (https://github.com/apache/incubator-mxnet/pull/13190) and
caught a few exceptions when running the Python unit tests. But this only
works on Linux.

I would like to get more feedback on the best practice for catching such
bugs in the code and on whether we should enforce such checks in CI. Any
comments are appreciated.

Best Regards,

Lin
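The signal trap in the PR above is C++/Linux-specific, but the behavior it targets can be illustrated at the Python level with NumPy's floating-point error-state machinery. This is an analogy only, not what the engine does internally: by default, float divide-by-zero silently produces inf (and downstream NaNs), while an explicit error state surfaces it immediately.

```python
# Illustration of silent vs. trapped divide-by-zero using NumPy's
# error-state controls. This is a frontend analogy; the linked PR instead
# traps SIGFPE in the C++ engine on Linux.
import numpy as np

x = np.array([1.0])
z = np.array([0.0])

with np.errstate(divide='ignore'):
    print(x / z)              # silently yields [inf] -> downstream NaNs

try:
    with np.errstate(divide='raise'):
        x / z                 # now the bug surfaces at the division site
except FloatingPointError as e:
    print("caught:", e)
```

Enforcing the equivalent check in CI would mean running the backend tests with the trap enabled, which is exactly the Linux-only limitation the message notes.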


[Question] Difference between "Feature" and "Feature request" labels in Github

2018-11-13 Thread Lin Yuan
Dear Community,

I often see there are "Feature" and "Feature request" labels in Github
issues. May I know the difference? If they are meant to be the same thing,
can we only keep one of them?

Thanks,

Lin


Re: [DISCUSS] Speedup non-code PR in CI

2018-11-06 Thread Lin Yuan
Kellen and Pedro,

Thanks for your pointers. I am not an expert in CI, but one naive speedup
I can see is to skip the build and test cycles when a PR only touches *.md
files. This would make documentation fixes easier and save computation
resources for tests that are actually needed. Are there any side effects?

Thanks,

Lin
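The check Lin describes can be sketched as a small gate a CI pipeline could run before launching any builds. The function and suffix list below are assumptions for illustration; a real pipeline would feed it the output of `git diff --name-only` against the target branch:

```python
# Minimal sketch of a "docs-only" gate for CI. A pipeline could run this
# against the PR's changed-file list and skip the build/test stages when it
# returns True. The suffix list is an assumption, not project policy.
DOC_ONLY_SUFFIXES = ('.md', '.rst', '.txt')

def is_docs_only(changed_files):
    files = list(changed_files)
    # An empty diff should not skip CI; require at least one file.
    return bool(files) and all(
        f.lower().endswith(DOC_ONLY_SUFFIXES) for f in files
    )

print(is_docs_only(['README.md', 'docs/faq.md']))         # True  -> skip CI
print(is_docs_only(['README.md', 'src/operator/nn.cc']))  # False -> full CI
```

One real side effect to consider: projects that execute tutorials authored as markdown would need those paths excluded from the skip list, since editing such a file can change test behavior.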


Re: [Announce] Upcoming Apache MXNet (incubating) 1.3.1 patch release

2018-11-06 Thread Lin Yuan
Hi Anton,

Thanks for helping with the release.
The following PRs are needed by customers who want to use deterministic
CUDNN convolution algorithms:

https://github.com/apache/incubator-mxnet/pull/12992
https://github.com/apache/incubator-mxnet/pull/13049

Thanks!

Lin


On Tue, Nov 6, 2018 at 1:51 PM Aaron Markham 
wrote:

> Hi Anton,
> I have the following suggestions for fixes to include in 1.3.1. These each
> have updates to files that will impact docs generation for the 1.3.x
> version of the website's Python API docs:
>
> https://github.com/apache/incubator-mxnet/pull/12879
> https://github.com/apache/incubator-mxnet/pull/12871
> https://github.com/apache/incubator-mxnet/pull/12856
>
> Thanks,
> Aaron
>
> On Tue, Nov 6, 2018 at 1:29 PM Lai Wei  wrote:
>
> > Hi Anton,
> >
> > Thanks for driving this, I would like to include the following fix in
> > 1.3.1:
> > Allow infer shape partial on foreach operator:
> > https://github.com/apache/incubator-mxnet/pull/12471
> >
> > Keras-MXNet needs this functionality to infer shape partially
> > on foreach operator. (Used in RNN operators)
> >
> > Thanks a lot!
> >
> >
> > Best Regards
> > Lai Wei
> >
> >
> >
> > On Tue, Nov 6, 2018 at 10:44 AM Haibin Lin 
> > wrote:
> >
> > > Hi Naveen and Anton,
> > >
> > > Thanks for pointing that out. You are right that these are not critical
> > > fixes. Putting them in 1.4.0 is more appropriate. PRs are closed.
> > >
> > > Best,
> > > Haibin
> > >
> > > On Tue, Nov 6, 2018 at 7:35 AM Naveen Swamy 
> wrote:
> > >
> > > > Please note that this is a patch release(1.3.1) to address critical
> > > bugs!,
> > > > For everything else please wait for 1.4.0 which is planned very
> shortly
> > > > after 1.3.1
> > > >
> > > > > On Nov 6, 2018, at 7:17 AM, Anton Chernov 
> > wrote:
> > > > >
> > > > > The following PR's have been created so far:
> > > > >
> > > > > Infer dtype in SymbolBlock import from input symbol (v1.3.x)
> > > > > https://github.com/apache/incubator-mxnet/pull/13117
> > > > >
> > > > > [MXNET-953] Fix oob memory read (v1.3.x)
> > > > > https://github.com/apache/incubator-mxnet/pull/13118
> > > > >
> > > > > [MXNET-969] Fix buffer overflow in RNNOp (v1.3.x)
> > > > > https://github.com/apache/incubator-mxnet/pull/13119
> > > > >
> > > > > [MXNET-922] Fix memleak in profiler (v1.3.x)
> > > > > https://github.com/apache/incubator-mxnet/pull/13120
> > > > >
> > > > > Set correct update on kvstore flag in dist_device_sync mode
> (v1.3.x)
> > > > > https://github.com/apache/incubator-mxnet/pull/13121
> > > > >
> > > > > update mshadow (v1.3.x)
> > > > > https://github.com/apache/incubator-mxnet/pull/13122
> > > > >
> > > > > CudnnFind() usage improvements (v1.3.x)
> > > > > https://github.com/apache/incubator-mxnet/pull/13123
> > > > >
> > > > > Fix lazy record io when used with dataloader and multi_worker > 0
> > > > (v1.3.x)
> > > > > https://github.com/apache/incubator-mxnet/pull/13124
> > > > >
> > > > >
> > > > > As stated previously I would be rather opposed to have following
> PR's
> > > it
> > > > in
> > > > > the patch release:
> > > > >
> > > > > Gluon LSTM Projection and Clipping Support (#13055) v1.3.x
> > > > > https://github.com/apache/incubator-mxnet/pull/13129
> > > > >
> > > > > sample_like operators (#13034) v1.3.x
> > > > > https://github.com/apache/incubator-mxnet/pull/13130
> > > > >
> > > > >
> > > > > Best
> > > > > Anton
> > > > >
> > > > > вт, 6 нояб. 2018 г. в 16:06, Anton Chernov :
> > > > >
> > > > >> Hi Haibin,
> > > > >>
> > > > >> I have a few comments regarding the proposed performance
> improvement
> > > > >> changes.
> > > > >>
> > > > >> CUDNN support for LSTM with projection & clipping
> > > > >> https://github.com/apache/incubator-mxnet/pull/13056
> > > > >>
> > > > >> There is no doubt that this change brings value, but I don't see
> it
> > > as a
> > > > >> critical bug fix. I would rather leave it for the next major
> > release.
> > > > >>
> > > > >> sample_like operators
> > > > >> https://github.com/apache/incubator-mxnet/pull/13034
> > > > >>
> > > > >> Even if it's related to performance, this is an addition of
> > > > functionality
> > > > >> and I would also push this to be in the next major release only.
> > > > >>
> > > > >>
> > > > >> Best
> > > > >> Anton
> > > > >>
> > > > >>
> > > > >> вт, 6 нояб. 2018 г. в 15:55, Anton Chernov :
> > > > >>
> > > > >>> Hi Patric,
> > > > >>>
> > > > >>> This change was listed in the 'PR candidates suggested for
> > > > consideration
> > > > >>> for v1.3.1 patch release' section [1].
> > > > >>>
> > > > >>> You are right, I also think that this is not a critical hotfix
> > change
> > > > >>> that should be included into the 1.3.1 patch release.
> > > > >>>
> > > > >>> Thus I'm not making any further efforts to bring it in.
> > > > >>>
> > > > >>> Best
> > > > >>> Anton
> > > > >>>
> > > > >>> [1]
> > > > >>>
> > > >
> > >
> >
> https://cwiki.apache.org/confluence/display/MXNET/Project+Proposals+for+next+MXNet+Release#PR_candidates
> > > > 

Re: [Question] Difference between "Feature" and "Feature request" labels in Github

2018-11-13 Thread Lin Yuan
Thanks guys for your prompt actions. I am so impressed!

Lin

On Tue, Nov 13, 2018 at 5:33 PM Sheng Zha  wrote:

> I was in the middle of transferring all items labeled with "Feature" to the
> "Feature request" label when "Feature" label was deleted. I'm not sure who
> deleted the "Feature" label but it's gone now.
>
> -sz
>
> On Tue, Nov 13, 2018 at 5:05 PM Anirudh Acharya 
> wrote:
>
> > This issue was raised before here -
> >
> >
> https://lists.apache.org/thread.html/3e988e6bd82cb2d69ba20c21bf763952ed22a5732e61f6fba1f89ac8@%3Cdev.mxnet.apache.org%3E
> >
> > We need someone with committer privileges to fix it.
> >
> >
> > Thanks
> > Anirudh
> >
> >
> >
> > On Tue, Nov 13, 2018 at 4:36 PM Lin Yuan  wrote:
> >
> > > Dear Community,
> > >
> > > I often see there are "Feature" and "Feature request" labels in Github
> > > issues. May I know the difference? If they are meant to be the same
> > thing,
> > > can we only keep one of them?
> > >
> > > Thanks,
> > >
> > > Lin
> > >
> >
>


Re: CUDNN algorithm selection failure

2018-10-01 Thread Lin Yuan
I could not reproduce the error on an EC2 g3x8 instance, making it hard to
debug. I also suspect it was due to a resource usage limit on the CI instance.

On Mon, Oct 1, 2018 at 10:40 PM Pedro Larroy 
wrote:

> It doesn't look like flakiness to me at first sight. I think it might be
> related to resource usage / allocation / leak in the worst case.
>
> Could be that there was not enough memory GPU memory at the time of test
> execution. But I'm just speculating, hence my original question.
>
> Pedro.
>
> On Mon, Oct 1, 2018 at 8:16 PM Lin Yuan  wrote:
>
> > Hi Pedro,
> >
> > I also got this failure in my PR
> >
> >
> http://jenkins.mxnet-ci.amazon-ml.com/blue/organizations/jenkins/incubator-mxnet/detail/PR-11742/27/pipeline
> >
> > I was not able to identify the root cause of it from changelist. Are you
> > suggesting there is some flakiness in the master branch too?
> >
> > Thanks,
> >
> > Lin
> >
> > On Mon, Oct 1, 2018 at 4:55 PM Pedro Larroy <
> pedro.larroy.li...@gmail.com>
> > wrote:
> >
> > > Hi
> > >
> > > I saw this failure on CI:
> > >
> > >
> >
> http://jenkins.mxnet-ci.amazon-ml.com/blue/organizations/jenkins/incubator-mxnet/detail/master/1697/pipeline
> > >
> > > Have you seen other cases where we fail to select the best CUDNN
> > algorithm?
> > > In which circumstances this could happen, and do you think is a good
> idea
> > > to have one selected by default as a last resort?
> > >
> > >
> > > Pedro.
> > >
> >
>


Re: CUDNN algorithm selection failure

2018-10-01 Thread Lin Yuan
Hi Pedro,

I also got this failure in my PR
http://jenkins.mxnet-ci.amazon-ml.com/blue/organizations/jenkins/incubator-mxnet/detail/PR-11742/27/pipeline

I was not able to identify the root cause from the changelist. Are you
suggesting there is some flakiness in the master branch too?

Thanks,

Lin

On Mon, Oct 1, 2018 at 4:55 PM Pedro Larroy 
wrote:

> Hi
>
> I saw this failure on CI:
>
> http://jenkins.mxnet-ci.amazon-ml.com/blue/organizations/jenkins/incubator-mxnet/detail/master/1697/pipeline
>
> Have you seen other cases where we fail to select the best CUDNN algorithm?
> In which circumstances this could happen, and do you think is a good idea
> to have one selected by default as a last resort?
>
>
> Pedro.
>
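Pedro's "one selected by default as a last resort" suggestion is a generic fallback pattern, sketched abstractly below. The algorithm names and the failing search are stand-ins, not cuDNN's actual API; in MXNet this would wrap the cuDNN algorithm-find step:

```python
# Generic "best effort, then safe default" selection, illustrating the
# last-resort fallback Pedro suggests. The search and algorithm names are
# placeholders, not cuDNN's real interface.
SAFE_DEFAULT = "implicit_gemm"  # stand-in for a conservative algorithm

def select_algorithm(find_best, default=SAFE_DEFAULT):
    try:
        return find_best()
    except RuntimeError as e:
        # e.g. the algo search failed because GPU memory was exhausted on CI
        print(f"algo search failed ({e}); falling back to {default}")
        return default

def failing_search():
    raise RuntimeError("algo find: out of memory")

print(select_algorithm(failing_search))       # falls back to implicit_gemm
print(select_algorithm(lambda: "winograd"))   # winograd
```

The trade-off is that a silent fallback can mask the underlying resource problem, so logging the failure (as above) matters as much as the fallback itself.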


Re: Propose to discontinue supporting Apache MXNet on Windows 7

2018-09-03 Thread Lin Yuan
> > > > > On the other hand the lack of data should not prevent us from
> > > moving
> > > > > forward and dropping support for outdated OS.
> > > > > In any case we would have to announce dropping a platform
> support
> > > at
> > > > least
> > > > > a release in advance.
> > > > > Steffen
> > > > >
> > > > > On Thu, Aug 30, 2018 at 12:21 PM Sheng Zha <
> zhash...@apache.org>
> > > wrote:
> > > > >
> > > > > > Hi Kellen,
> > > > > >
> > > > > > Thanks for the explanation. Unfortunately, I don't have the
> > > usage data,
> > > > > so
> > > > > > I refrained from voting. If any of the voters have such data
> > I'd
> > > love
> > > > to
> > > > > > see it too.
> > > > > >
> > > > > > -sz
> > > > > >
> > > > > > On 2018/08/30 14:58:09, kellen sunderland <
> > > kellen.sunderl...@gmail.com
> > > > >
> > > > > > wrote:
> > > > > > > I haven't spoken to anyone about the decision (as I'm
> > > currently on an
> > > > > > > island in the med) but to me the quick +1s are likely a
> > result
> > > of
> > > > this
> > > > > > > being a fairly straightforward decision.  The factors that
> > > went into
> > > > my
> > > > > > > thinking were (1) prioritizing growing platforms rather
> than
> > > > shrinking
> > > > > > > platforms (i.e. thinking long term rather than shirt term)
> > and
> > > (2)
> > > > > > earning
> > > > > > > our customers' trust.  Claiming support for a platform when
> > we
> > > can't
> > > > > > > realistically deliver it would lose us trust.  I'd prefer
> to
> > > over
> > > > > deliver
> > > > > > > and under promise when it comes to windows 7 for this
> reason.
> > > > > > >
> > > > > > > Now on the flip side one thing I would see as valuable is
> to
> > > try and
> > > > > get
> > > > > > > windows builds working with clang.  This could be
> beneficial
> > > in the
> > > > > sense
> > > > > > > that it would be easy to maintain for mxnet devs and allow
> us
> > > to use
> > > > > > modern
> > > > > > > cpp on older windows machines without using vs 2013(which I
> > > consider
> > > > a
> > > > > > > non-starter with our codebase).
> > > > > > >
> > > > > > > You have piqued my curiosity though Sheng.  How many win7
> > > users does
> > > > > > MXNet
> > > > > > > have relative to macos/Linux?
> > > > > > >
> > > > > > > On Thu, Aug 30, 2018, 8:51 AM Sheng Zha <
> szha@gmail.com>
> > > wrote:
> > > > > > >
> > > > > > > > Hi Yuan,
> > > > > > > >
> > > > > > > > No problem. This is an issue that's worth having a clear
> > > > definition,
> > > > > so
> > > > > > > > there's nothing wrong about your proposal, and thanks for
> > > bringing
> > > > > > this up.
> > > > > > > >
> > > > > > > > I'm more concerned about the seemingly unanimous votes on
> > > dropping
> > > > > > support
> > > > > > > > on a platform without seeing the supporting evidence that
> > > it's the
> > > > > > right
> > > > > > > > thing. It is as if everyone who participated in the vote
> > are
> > > > already
> > > > > > on the
> > > > > > > > same page, and somehow I'm the only one that's not. But
> the
> > > only
> > > > > > argument I
> > > > > > > > hear so far is that it's technically not straightf

Re: [LAZY VOTE] Consolidating developer guide in one place (cwiki preferred)

2018-09-26 Thread Lin Yuan
Hi Aaron,

Do we have a resolution for this proposal yet? Recently, there have been
many asks for better documentation for MXNet developers. I think it's a
good time that we consolidate the developer documentation in a central
place. Any thoughts or plan?

Many Thanks,

Lin

On Tue, Sep 4, 2018 at 1:55 PM Lin Yuan  wrote:

> +1
>
> On Tue, Sep 4, 2018 at 1:46 PM Aaron Markham 
> wrote:
>
>> I'd like to call for a lazy vote on this before proceeding. Already had
>> some +1s but let's be sure.
>>
>> The vote is to move developer guide info to cwiki. User guides would
>> remain
>> on the website.
>>
>> On Tue, Aug 21, 2018 at 12:53 PM sandeep krishnamurthy <
>> sandeep.krishn...@gmail.com> wrote:
>>
>> > +1
>> > Thanks Lin and Aaron. I agree website to cover all user facing
>> > documentation and a separate consolidated and organized developer
>> focussed
>> > docs in one place (cwiki).
>> >
>> >
>> > Note: Permissions on cwiki is currently not well managed with many
>> people
>> > having full admin rights to edit/create/delete pages. Should be fine for
>> > now, but, when we start accumulating many documents and resources, we
>> > should probably revisit on Delete permissions.
>> >
>> >
>> > On Tue, Aug 21, 2018 at 11:57 AM Lin Yuan  wrote:
>> >
>> > > Hi Aaron,
>> > >
>> > > Thanks for your answer. I think it's a very worthwhile effort to move
>> all
> > the developer-related content from the mxnet.io website to a dedicated
>> > > developer
>> > > site. Would you like to initiate this effort?
>> > >
>> > > Best,
>> > >
>> > > Lin
>> > >
>> > > On Wed, Aug 15, 2018 at 3:47 PM Haibin Lin 
>> > > wrote:
>> > >
>> > > > +1
>> > > >
>> > > > On Wed, Aug 15, 2018 at 1:10 PM, Aaron Markham <
>> > > aaron.s.mark...@gmail.com>
>> > > > wrote:
>> > > >
>> > > > > Hi Lin, I agree with this organization. If you feel like
>> somethings
>> > > > should
>> > > > > be transitioned from the website to the wiki, I can help with
>> that,
>> > but
>> > > > for
>> > > > > the moment I've been suggesting that new developer-focused
>> content be
>> > > > > placed on the wiki.
>> > > > >
>> > > > > On Tue, Aug 14, 2018 at 10:40 AM, Lin Yuan 
>> > > wrote:
>> > > > >
>> > > > > > Dear MXNet community,
>> > > > > >
>> > > > > > As a developer, I noticed we have some developer guide
>> scattered in
>> > > > > > different websites (mxnet.io, cwiki):
>> > > > > >
>> > > > > > E.g.
>> > > > > >
>> > > > > > How to Create New Operators (Layers): [
>> > > > > > https://mxnet.incubator.apache.org/faq/new_op.html]
>> > > > > > A Guide to Implementing Sparse Operators in MXNet Backend [
>> > > > > > https://cwiki.apache.org/confluence/display/MXNET/A+
>> > > > > > Guide+to+Implementing+Sparse+Operators+in+MXNet+Backend
>> > > > > > ]
>> > > > > >
>> > > > > > When searching developer guide by keyword, only one of them can
>> be
>> > > > > returned
>> > > > > > on either site.
>> > > > > >
>> > > > > > It will be more convenient for developers if all the developer
>> > guide
>> > > > > > resides on cwiki and all user guide (non-developer) on the
>> > mxnet.io
>> > > > > > website. We can add a link on mxnet.io to refer all developers
>> to
>> > > > cwiki
>> > > > > > for
>> > > > > > guidance.
>> > > > > >
>> > > > > > Any comment is appreciated.
>> > > > > >
>> > > > > > Best Regards,
>> > > > > >
>> > > > > > Lin
>> > > > > >
>> > > > >
>> > > >
>> > >
>> >
>> >
>> > --
>> > Sandeep Krishnamurthy
>> >
>>
>


Re: [LAZY VOTE] Consolidating developer guide in one place (cwiki preferred)

2018-09-28 Thread Lin Yuan
Hi Aaron,

Thanks a lot for the effort. This consolidation will make it more convenient
for developers to find development resources and help attract more
contributors.

I have also created a story to make it easy for developers to navigate from
mxnet.io: https://issues.apache.org/jira/browse/MXNET-1002

Thanks!

Lin

On Wed, Sep 26, 2018 at 10:24 AM Aaron Markham 
wrote:

> I think the latest feedback has been great. It seems to be mostly user
> level issues though. Installation and usage primarily, with a sprinkle of
> *if that stuff was better then I might be able to contribute*.
>
> I've (with a few other contributors) tackled some of the very direct bits
> of feedback for the website by incremental improvement of the install
> pages, Gluon info, and UX for the API docs.
>
> I've started additional planning for updates by adding an epic with
> specific stories and tasks to Jira for the documentation pipeline (the
> backend part of the website build):
> https://issues.apache.org/jira/browse/MXNET-957
>
> I've also added one that is more specific to the website's content:
> https://issues.apache.org/jira/browse/MXNET-986
> This is where I've captured only two tasks related to transitioning content
> related to "contributing to MXNet" over to the wiki. Any pointers on which
> content to move would help. These could be added as tasks too.
>
> I welcome any suggestions, additions, and contributions to either of these
> epics.
>
> Cheers,
> Aaron
>
> On Wed, Sep 26, 2018, 00:02 Lin Yuan  wrote:
>
> > Hi Aaron,
> >
> > Do we have a resolution for this proposal yet? Recently, there have been
> > many asks for better documentation for MXNet developers. I think it's a
> > good time that we consolidate the developer documentation in a central
> > place. Any thoughts or plan?
> >
> > Many Thanks,
> >
> > Lin
> >
> > On Tue, Sep 4, 2018 at 1:55 PM Lin Yuan  wrote:
> >
> > > +1
> > >
> > > On Tue, Sep 4, 2018 at 1:46 PM Aaron Markham <
> aaron.s.mark...@gmail.com>
> > > wrote:
> > >
> > >> I'd like to call for a lazy vote on this before proceeding. Already
> had
> > >> some +1s but let's be sure.
> > >>
> > >> The vote is to move developer guide info to cwiki. User guides would
> > >> remain
> > >> on the website.
> > >>
> > >> On Tue, Aug 21, 2018 at 12:53 PM sandeep krishnamurthy <
> > >> sandeep.krishn...@gmail.com> wrote:
> > >>
> > >> > +1
> > >> > Thanks Lin and Aaron. I agree website to cover all user facing
> > >> > documentation and a separate consolidated and organized developer
> > >> focussed
> > >> > docs in one place (cwiki).
> > >> >
> > >> >
> > >> > Note: Permissions on cwiki is currently not well managed with many
> > >> people
> > >> > having full admin rights to edit/create/delete pages. Should be fine
> > for
> > >> > now, but, when we start accumulating many documents and resources,
> we
> > >> > should probably revisit on Delete permissions.
> > >> >
> > >> >
> > >> > On Tue, Aug 21, 2018 at 11:57 AM Lin Yuan 
> > wrote:
> > >> >
> > >> > > Hi Aaron,
> > >> > >
> > >> > > Thanks for your answer. I think it's a very worthwhile effort to
> > move
> > >> all
> > >> > > the developer-related content from the mxnet.io website to a dedicated
> > >> > > developer
> > >> > > site. Would you like to initiate this effort?
> > >> > >
> > >> > > Best,
> > >> > >
> > >> > > Lin
> > >> > >
> > >> > > On Wed, Aug 15, 2018 at 3:47 PM Haibin Lin <
> > haibin.lin@gmail.com>
> > >> > > wrote:
> > >> > >
> > >> > > > +1
> > >> > > >
> > >> > > > On Wed, Aug 15, 2018 at 1:10 PM, Aaron Markham <
> > >> > > aaron.s.mark...@gmail.com>
> > >> > > > wrote:
> > >> > > >
> > >> > > > > Hi Lin, I agree with this organization. If you feel like
> > >> somethings
> > >> > > > should
> > >> > > > > be transitioned from the website to the wiki, I can help with
> > >> that,
> > >> > but
> > >> > > > f

Re: [DISCUSS] Use modernized C++11 range loops uniformly throughout the project

2018-09-28 Thread Lin Yuan
+1

Using range-based for-loops whenever possible improves code readability and
makes code less prone to human error.

I did some preliminary research on Google and did not find any complaints
about their performance drawbacks. Here is one StackOverflow answer for
reference:
https://stackoverflow.com/questions/10821756/is-the-ranged-based-for-loop-beneficial-to-performance

Lin

On Fri, Sep 28, 2018 at 7:42 AM kellen sunderland <
kellen.sunderl...@gmail.com> wrote:

> "Range loops aren’t always the most performant way" Do you have an example
> where there's a perf difference?
>
> "In addition, sometimes you want the index. Or maybe you want to iterate
> backwards, or not start from the first, etc. Maybe you want the iterator
> because you remove it from the list at the bottom of the loop Seems
> like a rule for the sake of having a rule."
>
> I should have been more clear about this point.  If you're using the index
> in the loop, doing reverse iteration, or not iterating from start-to-end
> this inspection is smart enough to realize it and will not suggest
> optimizing that type of loop.  The loops that would be changed are _only_
> the loops which are detected as equivalent to range-loops.  Examples can be
> found here:
> https://clang.llvm.org/extra/clang-tidy/checks/modernize-loop-convert.html
> or you can look at what's been changed in the ref PR.  I've initially set
> our confidence level at 'reasonable' but we could also set to 'safe' which
> would further reduce the number of loops the check would apply to.
>
> -Kellen
>
> On Fri, Sep 28, 2018 at 3:54 PM Chris Olivier 
> wrote:
>
> > -1
> >
> > Range loops aren’t always the most performant way. In addition, sometimes
> > you want the index. Or maybe you want to iterate backwards, or not start
> > from the first, etc. Maybe you want the iterator because you remove it
> from
> > the list at the bottom of the loop Seems like a rule for the sake of
> > having a rule.
> >
> > On Fri, Sep 28, 2018 at 2:12 AM kellen sunderland <
> > kellen.sunderl...@gmail.com> wrote:
> >
> > > Hello MXNet devs,
> > >
> > > I'd like to discuss uniformly adopting C++11 range loops in the MXNet
> > > project.  The benefits I see are:
> > >
> > > *  Improved C++ readability (examples below).
> > > *  Consistency with other languages.  The range-loops are quite similar
> > to
> > > loops in almost all other programming languages.  Given we're a project
> that
> > > supports many languages this language consistency could be positive for
> > our
> > > community.
> > > * Consistency within the same project.  Currently different authors
> have
> > > different loop styles, which hurts codebase readability.
> > > *  Best available performance.  There are often multiple ways to write
> > > loops in C++ with subtle differences in performance and memory usage
> > > between loop methods.  Using range-loops ensures we get the best
> possible
> > > perf using an intuitive loop pattern.
> > > *  Slightly lower chance for bugs / OOB accesses when dealing with
> > indexing
> > > in an array for example.
> > >
> > > If we decide to enable this uniformly throughout the project we can
> > enable
> > > this policy with a simple clang-tidy configuration change.  There would
> > be
> > > no need for reviewers to have to manually provide feedback when someone
> > > uses an older C++ loop style.
> > >
> > > -Kellen
> > >
> > > Reference PR:  https://github.com/apache/incubator-mxnet/pull/12356/
> > > Previous clang-tidy discussion on the list:
> > >
> > >
> >
> https://lists.apache.org/thread.html/b0ae5a9df5dfe0d9074cb2ebe432264db4fa2175b89fa43a5f6e36be@%3Cdev.mxnet.apache.org%3E
> > >
> > > -
> > > Examples:
> > > for (auto axis_iter = param.axis.begin(); axis_iter != param.axis.end();
> > > ++axis_iter) {
> > > CHECK_LT(*axis_iter, static_cast<int>(ishape.ndim()));
> > > stride_[reverse_index] = ishape[*axis_iter];
> > > ...
> > > -->
> > > for (int axis : param.axis) {
> > > CHECK_LT(axis, static_cast<int>(ishape.ndim()));
> > > stride_[reverse_index] = ishape[axis];
> > > ...
> > > --
> > > for (size_t i = 0; i < in_array.size(); i++) {
> > > auto &nd = in_array[i];
> > > pre_temp_buf_.emplace_back(nd.shape(), nd.ctx(), true, nd.dtype());
> > > }
> > > -->
> > > for (auto &nd : in_array) {
> > > pre_temp_buf_.emplace_back(nd.shape(), nd.ctx(), true, nd.dtype());
> > > }
> > >
> >
>


Re: [Annoucement] New Committer -- Iblis Lin

2019-01-05 Thread Lin Yuan
Welcome Iblis,

Great to see a good Julia support in MXNet!

Lin

On Sat, Jan 5, 2019 at 12:32 PM Marco de Abreu 
wrote:

> Welcome Iblis,
>
> great to have you on board!
>
> -Marco
>
> Am Sa., 5. Jan. 2019, 21:13 hat Carin Meier 
> geschrieben:
>
> > Please join me in welcoming Iblis Lin as a new committer.
> >
> > He has been a long-time contributor to the Julia package, is responsible
> > for bringing it into the main MXNet repo, and is its current maintainer.
> >
> > https://github.com/apache/incubator-mxnet/commits?author=iblis17
> >
> > - Carin Meier
> >
>


Re: Apache MXNet v1.4.0 release status

2019-01-15 Thread Lin Yuan
Hi Steffen,

I would like to ask to include one more PR for 1.4.0.rc1:
https://github.com/apache/incubator-mxnet/pull/13845

This PR exports the exception handling API of MXNet. It is needed by the
Horovod-MXNet integration to throw exceptions at the Python level rather
than triggering a C++ abort.

Thanks,

Lin


On Tue, Jan 15, 2019 at 2:24 PM Steffen Rochel 
wrote:

> Dear MXNet community -
> Zach & friends made good progress resolving the licensing issues. One more
> PR on 1.4.x branch is expected today.
> The code freeze for 1.4.0.rc1 is Thursday Jan 17th 6pm PST.
> I'm asking the requester to add following PR to 1.4.x branch:
> Tao:
> https://github.com/apache/incubator-mxnet/pull/13882
> Kellen:
> https://github.com/apache/incubator-mxnet/pull/13697
> https://github.com/apache/incubator-mxnet/pull/13188
> https://github.com/apache/incubator-mxnet/pull/13727
> https://github.com/apache/incubator-mxnet/pull/13695
> Pedro:
> https://github.com/apache/incubator-mxnet/pull/13535
>
> If there are additional PR to be considered for 1.4.0.rc1 please send
> request to dev@.
>
> Regards,
> Steffen
>
> On Tue, Jan 8, 2019 at 11:28 AM Qing Lan  wrote:
>
> > Hi all,
> >
> > I added a section F in the document that explains the current
> > statically-linked dependencies we use for the official release. As a
> few
> > of the licenses are BSD3 and GPL, we need to handle them in our next
> > release. Please take a look and leave any concerns you may have.
> >
> > Thanks,
> > Qing
> >
> > On 1/7/19, 8:33 PM, "kellen sunderland" 
> > wrote:
> >
> > So I see two quick options that should cut down on the dependency
> > licenses
> > required for TRT in the source release.
> >
> > 1: We can simply remove in the release package the submodules for
> onnx
> > in
> > folder
> > incubator-mxnet/3rdparty/onnx-tensorrt/third_party/onnx/third_party.
> > None of those dependencies are used in the build (I've just verified
> > locally on my machine).
> > 2: We can make a cmake based checkout system and ensure we only
> > checkout
> > the required files when TRT builds are enabled (similar to the
> current
> > mkl-ml setup).
> >
> > I'd suggest option 1 for this release, and that we support option 2
> > for the
> > 1.5 release.
> >
> > On Mon, Jan 7, 2019 at 8:19 PM Lv, Tao A  wrote:
> >
> > > What should I do for the double headers in
> > 3rdparty/mkldnn/src/cpu/xbyak/?
> > >
> > > -tao
> > >
> > > -Original Message-
> > > From: Steffen Rochel [mailto:steffenroc...@gmail.com]
> > > Sent: Tuesday, January 8, 2019 10:51 AM
> > > To: dev@mxnet.incubator.apache.org
> > > Subject: Re: Apache MXNet v1.4.0 release status
> > >
> > > Kellen and Tao -
> > > yes, the understanding is that dependencies need to be considered
> > and all
> > > licences referenced to include in top level LICENSE file.
> > > Appreciate your help with it.
> > > Steffen
> > >
> > > On Mon, Jan 7, 2019 at 6:39 PM kellen sunderland <
> > > kellen.sunderl...@gmail.com> wrote:
> > >
> > > > Sorry to hear about the licensing issues.  I was following the
> > general
> > > > vote but I'm still lacking some clarity around what licenses in
> the
> > > > onnx-trt repo need to be surfaced.  I believe onnx-trt is MIT
> > > > licensed, but it includes Onnx as a third party repo which then
> > brings
> > > > in dependencies with a variety of licenses.  The proposal is that
> > we
> > > > look at these on an individual basis and then add them to our top
> > level
> > > LICENSE file right?
> > > >
> > > > An alternative is that we may be able to checkout a smaller
> source
> > > > code dependency tree if we remove a few unneeded ONNX's
> > dependencies
> > > > (pybind and google-bench).  My hope is that this wouldn't affect
> > our
> > > > compilation process and would get us down to two licenses to
> report
> > > > (just Onnx and Onnx-TRT, both MIT).
> > > >
> > > > On Mon, Jan 7, 2019 at 6:07 PM Meghna Baijal
> > > > 
> > > > wrote:
> > > >
> > > > > Hi All,
> > > > > For some more context, these were the last emails I sent on the
> > dev
> > > > > and legal lists requesting help on the open questions  –
> > > > >
> > > > > 1. Question on legal about the CC-By-2.5 <
> > > > >
> > > >
> > http://mail-archives.apache.org/mod_mbox/www-legal-discuss/201805.mbox
> > > > /%3CCAK1xzDe6ECToKt_2cTR_7txQQCwHeYfvxXDfmuGgfA3jaTs=
> > j...@mail.gmail.com
> > > > %3E
> > > > > >
> > > > > 2. Question on dev about googletest file <
> > > > >
> > > >
> > http://mail-archives.apache.org/mod_mbox/mxnet-dev/201804.mbox/%3CCAMG
> > > > gKDC8szdfFqQhhSNpwwT_3zi4LBS7A=u4v7kj4ule44u...@mail.gmail.com
> %3E
> > > > > >
> > > > > 3. General Request for review of the licenses wiki <
> > > > >
> > > >
> > 

Re: [Annoucement] New Committer -- Da Zheng

2018-12-17 Thread Lin Yuan
Congrats!

On Mon, Dec 17, 2018 at 9:19 AM Steffen Rochel 
wrote:

> Congratulation Da!
>
> On Mon, Dec 17, 2018 at 9:02 AM Tianqi Chen  wrote:
>
> > Dear Community:
> >
> > Please join me to welcome Da Zheng as a new committer of the MXNet.
> >
> > Da is the main author of MKL-DNN integration and recently he champions
> the
> > control flow support. He is one of the few "explorer style" contributors
> of
> > the community, who we desperately need in this fast change environment of
> > the deep learning system landscape.
> >
> > PRs https://github.com/apache/incubator-mxnet/commits?author=zheng-da
> > reviews
> > https://github.com/apache/incubator-mxnet/pulls?utf8=%E2%9C%93=is%3Apr+reviewed-by%3Azheng-da+
> > dev@  https://lists.apache.org/list.html?d...@mxnet.apache.org:lte=3y:da-
> > zheng
> >
> > Tianqi
> >
>


[Question] UI change policy in MXNet

2018-12-20 Thread Lin Yuan
Dear Community,

As a contributor, I would like to know the current policy for updating the
UI of an operator. I understand a UI change should be introduced in a major
release, not a minor release. However, the UI change process is still not
quite clear to me:

1) Which guideline should we follow when updating the UI in MXNet operators?
2) Who should approve the UI change?
3) In case of backward compatibility, should we favor breaking the backward
compatibility and update the release notes or adding a newer version of the
operator like ***_v2?
4) Which operator should go to contrib and which be implemented as regular?

Any clarification is appreciated and it is helpful to guide PR reviewers as
well.

Merry Christmas to ya'all!

Lin


Re: [Question] UI change policy in MXNet

2018-12-20 Thread Lin Yuan
Hi Anirudh,

Thanks a lot for your clarifications! I have some followup
questions/comments:

1) Which guideline should we follow when updating the UI in MXNet operators?
A) MXNet follows semantic versioning, so breaking changes to operator
interfaces can be introduced only in major versions.

(Lin:) My question is what style of UI guide we should follow, e.g. naming
convention, usage mode, etc. Something like numpy's style or tensorflow's?

2) Who should approve the UI change?
A) Contributors who may have worked on the operator and/or other
contributors/committers.

(Lin:) Is it too local to rely on the contributors to one or a few operators
to decide the UI? How can we ensure the consistency of the UI across all
operators in MXNet?

3) In case of backward compatibility, should we favor breaking the backward
compatibility and update the release notes or adding a newer version of the
operator like ***_v2?
A) If the operator interfaces are not compatible, its fine to create
operator with the name "_v2" . In the next major version release, you can
add an alias for newer implementation and deprecate the older one.

(Lin) What if there is already a "_v2"? Do we add "_v3", "_v4" as the project
evolves?

4) Which operator should go to contrib and which be implemented as regular?
A) I think this discussion may help:
https://github.com/apache/incubator-mxnet/pull/5499 . To summarize: contrib
was created for ops for which we provide limited guarantees with respect to
backward compatibility, interface changes, testing etc.

(Lin) This is definitely an informative discussion. It would be better if
we can put this in a more noticeable place for developers.


On Thu, Dec 20, 2018 at 1:39 PM Anirudh Subramanian 
wrote:

> 1) Which guideline should we follow when updating the UI in MXNet
> operators?
> A) MXNet follows semantic versioning, so breaking changes to operator
> interfaces can be introduced only in major versions.
>
> 2) Who should approve the UI change?
> A) Contributors who may have worked on the operator and/or other
> contributors/committers.
>
> 3) In case of backward compatibility, should we favor breaking the backward
> compatibility and update the release notes or adding a newer version of the
> operator like ***_v2?
> A) If the operator interfaces are not compatible, its fine to create
> operator with the name "_v2" . In the next major version release, you can
> add an alias for newer implementation and deprecate the older one.
>
> 4) Which operator should go to contrib and which be implemented as regular?
> A) I think this discussion may help:
> https://github.com/apache/incubator-mxnet/pull/5499 . To summarize:
> contrib
> was created for ops for which we provide limited guarantees with respect to
> backward compatibility, interface changes, testing etc.
>
> Anirudh
>
> On Thu, Dec 20, 2018 at 1:00 PM Lin Yuan  wrote:
>
> > Dear Community,
> >
> > As a contributor, I would like to know the current policy for updating UI
> > of an operator. I understand UI change should be introduced in major
> > release not minor release. However, it is still not quite clear to me
> > regarding the UI change process:
> >
> > 1) Which guideline should we follow when updating the UI in MXNet
> > operators?
> > 2) Who should approve the UI change?
> > 3) In case of backward compatibility, should we favor breaking the
> backward
> > compatibility and update the release notes or adding a newer version of
> the
> > operator like ***_v2?
> > 4) Which operator should go to contrib and which be implemented as
> regular?
> >
> > Any clarification is appreciated and it is helpful to guide PR reviewers
> as
> > well.
> >
> > Merry Christmas to ya'all!
> >
> > Lin
> >
>


Re: [Announce] Upcoming Apache MXNet (incubating) 1.4.0 release

2018-11-29 Thread Lin Yuan
Hi Steffen,

Can we add the following PR to 1.4.0 release:

https://github.com/apache/incubator-mxnet/pull/13452

It's just a Python API returning the header path, so it should not cause any
regression issues. But it is required for Horovod to integrate MXNet. It's
better to have this in a minor release than a patch release.

Thanks,

Lin

On Thu, Nov 29, 2018 at 6:46 PM Steffen Rochel 
wrote:

> Hi Zhi - thanks for the improvement, which we should consider for 1.4.0.
> However, I don't see any tests with the PR and think it is too risky to add
> changes without tests. I will add your PR to the tracking list, but would
> like to ask you to add functional tests before completing the PR to master
> and v1.4.x branch.
>
> Steffen
>
> On Thu, Nov 29, 2018 at 5:01 PM Joshua Z. Zhang 
> wrote:
>
> > Hi, I would like to bring a critical performance and stability patch of
> > existing gluon dataloader to 1.4.0:
> > https://github.com/apache/incubator-mxnet/pull/13447 <
> > https://github.com/apache/incubator-mxnet/pull/13447>.
> >
> > This PR is finished, waiting for CI to pass.
> >
> > Steffen, could you help me add that to the tracked list?
> >
> > Best,
> > Zhi
> >
> > > On Nov 29, 2018, at 4:25 PM, Naveen Swamy  wrote:
> > >
> > > the tests are randomly failing in different stages
> > >
> >
> http://jenkins.mxnet-ci.amazon-ml.com/blue/organizations/jenkins/incubator-mxnet/detail/PR-13105/
> > > This PR has failed 8 times so far
> > >
> > > On Thu, Nov 29, 2018 at 3:43 PM Steffen Rochel <
> steffenroc...@gmail.com>
> > > wrote:
> > >
> > >> Pedro - ok. Please add PR to v1.4.x branch after merge to master and
> > please
> > >> update tracking page
> > >> <
> > >>
> >
> https://cwiki.apache.org/confluence/display/MXNET/Apache+MXNet+%28incubating%29+1.4.0+Release+Plan+and+Status#ApacheMXNet(incubating)1.4.0ReleasePlanandStatus-OpenPRstotrack
> > >>>
> > >> .
> > >> Steffen
> > >>
> > >> On Thu, Nov 29, 2018 at 3:00 PM Pedro Larroy <
> > pedro.larroy.li...@gmail.com
> > >>>
> > >> wrote:
> > >>
> > >>> PR is ready from my side and passes the tests, unless somebody raises
> > >>> any concerns it's good to go.
> > >>> On Thu, Nov 29, 2018 at 9:50 PM Steffen Rochel <
> > steffenroc...@gmail.com>
> > >>> wrote:
> > 
> >  Pedro - added  to 1.4.0 tracking list
> >  <
> > >>>
> > >>
> >
> https://cwiki.apache.org/confluence/display/MXNET/Apache+MXNet+%28incubating%29+1.4.0+Release+Plan+and+Status#ApacheMXNet(incubating)1.4.0ReleasePlanandStatus-OpenPRstotrack
> > 
> > 
> >  Do you have already ETA?
> >  Steffen
> > 
> >  On Thu, Nov 29, 2018 at 6:13 AM Pedro Larroy <
> > >>> pedro.larroy.li...@gmail.com>
> >  wrote:
> > 
> > > Hi all.
> > >
> > > There are two important issues / fixes that should go in the next
> > > release in my radar:
> > >
> > > 1) https://github.com/apache/incubator-mxnet/pull/13409/files
> > > There is a bug in shape inference on CPU when not using MKL, also
> we
> > > are running activation on CPU via MKL when we compile CUDNN+MKLDNN.
> > > I'm finishing a fix for these issues in the above PR.
> > >
> > > 2) https://github.com/apache/incubator-mxnet/issues/13438
> > > We are seeing crashes due to unsafe setenv in multithreaded code.
> > > Setenv / getenv from multiple threads is not safe and is causing
> > > segfaults. This piece of code (the handlers in pthread_atfork)
> > >> already
> > > caused a very difficult to diagnose hang in a previous release,
> where
> > > a fork inside cudnn would deadlock the engine.
> > >
> > > I would remove setenv from 2) as a mitigation, but we would need to
> > > check for regressions as we could be creating additional threads
> > > inside the engine.
> > >
> > > I would suggest that we address these two major issues before the
> > >> next
> > > release.
> > >
> > > Pedro
> > >
> > >
> > >
> > > On Sun, Nov 25, 2018 at 11:41 PM Steffen Rochel <
> > >>> steffenroc...@gmail.com>
> > > wrote:
> > >>
> > >> Dear MXNet community,
> > >>
> > >> I will be the release manager for the upcoming Apache MXNet 1.4.0
> > > release.
> > >> Sergey Kolychev will be co-managing the release and providing help
> > >>> from
> > > the
> > >> committers side.
> > >> A release candidate will be cut on November 29, 2018 and voting
> > >> will
> > > start
> > >> December 7, 2018. Release notes have been drafted here [1]. If you
> > >>> have
> > > any
> > >> additional features in progress and would like to include it in
> > >> this
> > >> release, please assure they have been merged by November 27, 2018.
> > > Release
> > >> schedule is available here [2].
> > >>
> > >> Feel free to add any other comments/suggestions. Please help to
> > >>> review
> > > and
> > >> merge outstanding PR's and resolve issues impacting the quality of
> > >>> the
> > >> 

Re: [Announce] Upcoming Apache MXNet (incubating) 1.4.0 release

2018-11-29 Thread Lin Yuan
https://github.com/apache/incubator-mxnet/pull/13452 is needed in 1.4.0 to
support Horovod integration project.

Thanks!

Lin


On Thu, Nov 29, 2018 at 1:40 PM Davydenko, Denis <
dzianis.davydze...@gmail.com> wrote:

> I suggest to include this issue into tracked ones for the release:
> https://github.com/apache/incubator-mxnet/issues/12255. It has proven to
> be a problem with MXNet start up time and it will cause even more problems
> down the line with Elastic Training, EIA where MXNet is a commodity rather
> than a statically running process. Also, it already causes noticeable issues
> with MMS (MXNet Model Server [1]). MMS users already noticed significant
> lag with MMS start up time, especially on beefy instances like C5.18xl with
> 72 vCPUs. MMS spins up multiple MXNet instances during its start up to
> ensure full utilization of CPU or GPU resources on the host. By default it
> spins up as many MXNet instances as there are cores (either CPU or GPU
> cores) and the bigger the host the more MXNet instances are spun up. And
> the more MXNet instances spun up - the more each instance takes time to
> start. For example, on C5.4xl users reported waiting for as long as 2
> minutes to have just 8 MXNet instances spun up with MXNet 1.3. Same efforts
> with MXNet 1.1 take less than 0.5 sec.
>
> This is quite a significant regression in MXNet when it comes to start up
> experience. I suggest to consider this as a blocker for 1.4.
>
> [1] https://github.com/awslabs/mxnet-model-server
>
> On 11/29/18, 12:51 PM, "Steffen Rochel"  wrote:
>
> added to 1.4.0 tracking list
> <
> https://cwiki.apache.org/confluence/display/MXNET/Apache+MXNet+%28incubating%29+1.4.0+Release+Plan+and+Status#ApacheMXNet(incubating)1.4.0ReleasePlanandStatus-OpenPRstotrack
> >
> .
> Steffen
>
> On Thu, Nov 29, 2018 at 9:32 AM Zheng, Da 
> wrote:
>
> > Hello Steffen,
> >
> > Can this bug be fixed in 1.4.0 release? It's a significant
> performance
> > regression on sparse matrix multiplication.
> > https://github.com/apache/incubator-mxnet/issues/13449
> >
> > Thanks,
> > Da
> >
> > On 11/26/18, 6:42 AM, "Steffen Rochel" 
> wrote:
> >
> > Dear MXNet community,
> >
> > I will be the release manager for the upcoming Apache MXNet 1.4.0
> > release.
> > Sergey Kolychev will be co-managing the release and providing
> help
> > from the
> > committers side.
> > A release candidate will be cut on November 29, 2018 and voting will
> > start December 7, 2018. Release notes have been drafted here [1]. If you
> > have any additional features in progress and would like to include them
> > in this release, please ensure they have been merged by November 27,
> > 2018. Release schedule is available here [2].
> >
> > Feel free to add any other comments/suggestions. Please help to review
> > and merge outstanding PRs and resolve issues impacting the quality of
> > the 1.4.0 release.
> >
> > Regards,
> >
> > Steffen
> >
> > [1]
> >
> >
> https://cwiki.apache.org/confluence/display/MXNET/Apache+MXNet+%28incubating%29+1.4.0+Release+Notes
> >
> > [2]
> >
> https://cwiki.apache.org/confluence/display/MXNET/Apache+MXNet+%28incubating%29+1.4.0+Release+Plan+and+Status
> >
> >
> >
> >
> > On Tue, Nov 20, 2018 at 7:15 PM kellen sunderland <
> > kellen.sunderl...@gmail.com> wrote:
> >
> > > Spoke too soon[1], looks like others have been adding Turing
> support
> > as
> > > well (thanks to those helping with this).  I believe there's
> still a
> > few
> > > changes we'd have to make to claim support though (mshadow
> CMake
> > changes,
> > > PyPi package creation tweaks).
> > >
> > > 1:
> > >
> > >
> >
> https://github.com/apache/incubator-mxnet/commit/2c3357443ec3d49a11e93c89f278264ce10c2f08
> > >
> > > On Tue, Nov 20, 2018 at 7:00 PM kellen sunderland <
> > > kellen.sunderl...@gmail.com> wrote:
> > >
> > > > Hey Steffen, I'd like to be able to merge this PR for
> version 1.4:
> > > > https://github.com/apache/incubator-mxnet/pull/13310 . It
> fixes a
> > > > regression in master which causes incorrect feature vectors
> to be
> > output
> > > > when using the TensorRT feature.  (Thanks to Nathalie for
> helping
> > me
> > > track
> > > > down the root cause of the issue).   I'm currently blocked
> on a CI
> > issue
> > > I
> > > > haven't seen before, but hope to have it resolved by EOW.
> > > >
> > > > One call-out I would make is that we currently don't support
> Turing
> > > > architecture (sm_75).  I've been slowly trying to add
> support, but
> 

[PROPOSAL] Large tensor support in MXNet

2018-12-02 Thread Lin Yuan
Dear Community,

As some of you may have already encountered, MXNet has a limitation in
supporting tensors of size greater than ~4.3 billion elements (2^32). The
root cause is that in the MXNet backend a 32-bit integer type is used as the
default integer data type for both computation and storage in many places.

Lifting this limitation, however, is not as simple as replacing all 32-bit
integers with a larger data type (64-bit integer), as I detailed in the
design proposal at
https://cwiki.apache.org/confluence/display/MXNET/Large+Tensor+Support. It
requires a systematic approach to address this problem in the MXNet backend
as well as in all APIs across the different language bindings.
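To make the scale concrete, a quick arithmetic sketch of the limit described above:

```python
# A square tensor of shape (65536, 65536) already hits the limit described
# above: its element count equals 2**32, which cannot be represented by a
# signed 32-bit index.
INT32_MAX = 2**31 - 1            # 2,147,483,647
n_elements = 65536 * 65536       # 2**32 = 4,294,967,296 (~4.3 billion)
assert n_elements == 2**32
assert n_elements > INT32_MAX    # a 32-bit signed index overflows here
```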

I would appreciate your suggestions for solving this problem systematically
and elegantly, as well as your help supporting language bindings other than
Python.

Please add your comment in the design proposal or create tickets in the
JIRA epic:  https://issues.apache.org/jira/browse/MXNET-1184

Best Regards,

Lin


Re: CI impaired

2018-11-21 Thread Lin Yuan
Thanks for your efforts, Marco!

On Wed, Nov 21, 2018 at 4:02 PM Anirudh Subramanian 
wrote:

> Thanks for the quick response and mitigation!
>
> On Wed, Nov 21, 2018 at 3:55 PM Marco de Abreu
>  wrote:
>
> > Hello,
> >
> > today, CI had some issues and I had to cancel all jobs a few minutes ago.
> > This was basically caused by the high load that is currently being put on
> > our CI system due to the pre-release efforts for this Friday.
> >
> > It's really unfortunate that we just had outages of three core components
> > within the last two days - sorry about that! To recap, we had the
> > following outages (which are unrelated to the parallel refactor of the
> > Jenkins pipeline):
> > - (yesterday evening) The Jenkins master ran out of disk space and thus
> > processed requests at reduced capacity
> > - (this morning) The Jenkins master got updated, which broke our
> > auto scaling's upscaling capabilities.
> > - (new, this evening) Jenkins API was unresponsive: Due to the high
> number
> > of jobs and a bad API design in the Jenkins REST API, the time-complexity
> > of a simple create or delete request was quadratic which resulted in all
> > requests timing out (that was the current outage). This resulted in our
> > auto scaling to be unable to interface with the Jenkins master.
> >
> > I have now made improvements to our REST API calls which reduced the
> > complexity from O(N^2) to O(1). The reason was an underlying redirect
> loop
> > in the Jenkins createNode and deleteNode REST API in combination with
> > unrolling the entire slave and job graph (which got quite huge during
> > extensive load) upon every single request. Since we had about 150
> > registered slaves and 1000 jobs in the queue, the duration for a single
> > REST API call rose to up to 45 seconds (we execute up to a few hundred
> > queries per auto scaling loop). This led to our auto scaling timing out.
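The fix described above can be sketched abstractly: fetch the expensive Jenkins state once per auto-scaling loop rather than once per createNode/deleteNode call. All names here are hypothetical, not the actual autoscaling code:

```python
# Hypothetical sketch of the O(N^2) -> O(1)-per-request improvement above:
# take one snapshot of the slave/job graph per scaling loop, then run every
# pending action against that snapshot instead of re-walking the graph on
# each REST call.
def scaling_loop(fetch_state, apply_action, pending_actions):
    state = fetch_state()            # one O(N) snapshot per loop
    results = []
    for action in pending_actions:   # each action is now O(1) vs. the snapshot
        results.append(apply_action(state, action))
    return results
```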
> >
> > Everything should be back to normal now. I'm closely observing the
> > situation and I'll let you know if I encounter any additional issues.
> >
> > Again, sorry for any inconvenience caused.
> >
> > Best regards,
> > Marco
> >
> > On Wed, Nov 21, 2018 at 5:10 PM Gavin M Bell 
> > wrote:
> >
> > > Yes, let me add to the kudos, very nice work Marco.
> > >
> > >
> > > "I'm trying real hard to be the shepherd." -Jules Winnfield
> > >
> > >
> > > > On Nov 21, 2018, at 5:04 PM, Sunderland, Kellen
> > >  wrote:
> > > >
> > > > Appreciate the big effort in bring the CI back so quickly.  Thanks
> > Marco.
> > > >
> > > > On Nov 21, 2018 5:52 AM, Marco de Abreu <
> marco.g.ab...@googlemail.com
> > .INVALID>
> > > wrote:
> > > > Thanks Aaron! Just for the record, the new Jenkins jobs were
> unrelated
> > to
> > > > that incident.
> > > >
> > > > If somebody is interested in the details around the outage:
> > > >
> > > > Due to a required maintenance (disk running full), we had to upgrade
> > our
> > > > Jenkins master because it was running on Ubuntu 17.04 (for an unknown
> > > > reason, it used to be 16.04) and we needed to install some packages.
> > > Since
> > > > the support for Ubuntu 17.04 was stopped, this resulted in all
> package
> > > > updates and installations to fail because the repositories were taken
> > > > offline. Due to the unavailable maintenance package and other issues
> > with
> > > > the installed OpenJDK8 version, we made the decision to upgrade the
> > > Jenkins
> > > > master to Ubuntu 18.04 LTS in order to get back to a supported
> version
> > > with
> > > > maintenance tools. During this upgrade, Jenkins was automatically
> > updated
> > > > by APT as part of the dist-upgrade process.
> > > >
> > > > In the latest version of Jenkins, some labels have been changed which
> > we
> > > > depend on for our auto scaling. To be more specific:
> > > >> Waiting for next available executor on mxnetlinux-gpu
> > > > has been changed to
> > > >> Waiting for next available executor on ‘mxnetlinux-gpu’
> > > > Notice the quote characters.
> > > >
> > > > Jenkins does not offer a better way than to parse these messages
> > > > unfortunately - there's no standardized way to express queue items.
> > Since
> > > > our parser expected the above message without quote characters, this
> message
> > > was
> > > > discarded.
> > > >
> > > > We support various queue reasons (5 of them to be exact) that
> indicate
> > > > resource starvation. If we run super low on capacity, the queue
> reason
> > is
> > > > different and we would still be able to scale up, but most of the
> cases
> > > > would have printed the unsupported message. This resulted in reduced
> > > > capacity (to be specific, the limit during that time was 1 slave per
> > > type).
> > > >
> > > > We have now fixed our autoscaling to automatically strip these
> > characters
> > > > and added that message to our test suite.
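A minimal sketch of a quote-tolerant parser for the queue message above (the regex and function name are illustrative, not the actual autoscaling code):

```python
import re

# Accept both the old unquoted and the new typographically quoted label,
# stripping any surrounding quote characters from the captured label.
QUEUE_MSG = re.compile(
    r"Waiting for next available executor on [\u2018']?(?P<label>[\w-]+)[\u2019']?$"
)

def parse_queue_label(message):
    m = QUEUE_MSG.match(message)
    return m.group("label") if m else None
```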
> > > >
> > > > Best regards,
> > > > Marco
> > > >
> > > > On Wed, Nov 21, 2018 at 2:49 PM Aaron Markham <
> > aaron.s.mark...@gmail.com
> > > >
> > > > 

Re: [DISCUSS] Build OSX builds in CI (possibly with TravisCI).

2018-09-18 Thread Lin Yuan
> > > > > > The job only compiles MXNet on Mac and currently does not run
> unit
> > > > tests
> > > > > -
> > > > > > we expect the overall execution duration to be around 6 minutes
> and
> > > > thus
> > > > > > faster than the full Jenkins pipeline. The status is set to "not
> > > > > required"
> > > > > > which means that it does not block merging if that job fails
> since
> > > the
> > > > > > pipeline is still in beta. But in general, it would be good if
> > > > committers
> > > > > > review the results in case the job shows a failure. Our last
> known
> > > > state
> > > > > is
> > > > > > that the pipeline works properly, but we will keep everybody up
> to
> > > date
> > > > > in
> > > > > > case we get aware of any problems.
> > > > > >
> > > > > > The next step will be integration of Python CPU unit tests. There
> > > will
> > > > > be a
> > > > > > separate email when we have an update on that matter.
> > > > > >
> > > > > > Special thanks to Kellen Sunderland for the contribution of this
> > > Travis
> > > > > CI
> > > > > > pipeline.
> > > > > >
> > > > > > Best regards,
> > > > > > Marco
> > > > > >
> > > > > > On Wed, Sep 5, 2018 at 8:19 PM Tianqi Chen <
> > tqc...@cs.washington.edu
> > > >
> > > > > > wrote:
> > > > > >
> > > > > > > Alright, then I think it is fine as long as we can keep up with
> > > > > > > the build speed without timeouts.
> > > > > > >
> > > > > > >
> > > > > > > Tianqi
> > > > > > >
> > > > > > > On Wed, Sep 5, 2018 at 9:14 AM kellen sunderland <
> > > > > > > kellen.sunderl...@gmail.com> wrote:
> > > > > > >
> > > > > > > > Travis actually has explicit support for ccache, it's a
> > platform
> > > > > > feature.
> > > > > > > > I've run it and it seems to work quite well.  See for example
> > > this
> > > > > > build:
> > > > > > > >
> > > > > >
> > > >
> > https://travis-ci.org/KellenSunderland/incubator-mxnet/builds/424768656
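For reference, enabling ccache on Travis is a one-line config change. A minimal sketch of a build config using the platform feature mentioned above (the job settings are illustrative):

```yaml
language: cpp
os: osx
cache: ccache       # Travis platform feature for compiler caching
script:
  - make -j2
```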
> > > > > > > >
> > > > > > > > On Wed, Sep 5, 2018 at 7:10 PM Tianqi Chen <
> > > > tqc...@cs.washington.edu
> > > > > >
> > > > > > > > wrote:
> > > > > > > >
> > > > > > > > > Travis itself is stateless, which means ccache is not
> likely
> > > > going
> > > > > > to
> > > > > > > > > work. As far as I understand, if the Jenkins master is in
> > > > > > > > > the public domain, you do not need to set up a VPN to the
> > > > > > > > > subset of the master.
> > > > > > > > >
> > > > > > > > > As for versions of macOS, we are likely going to be fine
> > > > > > > > > with one version, as the problems exhibited on Mac are
> > > > > > > > > usually similar.
> > > > > > > > >
> > > > > > > > > Tianqi
> > > > > > > > > On Wed, Sep 5, 2018 at 9:04 AM kellen sunderland <
> > > > > > > > > kellen.sunderl...@gmail.com> wrote:
> > > > > > > > >
> > > > > > > > > > @Tianqi: Yeah there's going to be a lot of trade-offs to
> > > using
> > > > > > > > Travis.  I
> > > > > > > > > > hope we can get it running fast enough with ccache that
> it
> > > > won't
> > > > > > > > timeout
> > > > > > > > > > when running tests, but even that is questionable.  In my
> > > > private
> > > > > > > > testing
> > > > > > > > > > it was running in about 35 minutes and the global timeout
> > for
> > > > > > Travis
> > >

Re: [Announcement] New Committer - Alex Zai

2019-03-31 Thread Lin Yuan
Congrats, Alex! Hope your book is going well :)

Lin

On Sun, Mar 31, 2019 at 6:18 PM Zhao, Patric  wrote:

> Congratulation, Alex.
>
> Thank you for your help with the MKLDNN backend, including tests, coverage, and CI :)
>
> Looking forward to more cooperation together.
>
>
> > -Original Message-
> > From: Steffen Rochel [mailto:steffenroc...@gmail.com]
> > Sent: Monday, April 1, 2019 8:56 AM
> > To: dev@mxnet.incubator.apache.org
> > Cc: Alex Zai 
> > Subject: Re: [Announcement] New Committer - Alex Zai
> >
> > Congratulation Alex!
> >
> > On Sun, Mar 31, 2019 at 4:17 PM Carin Meier 
> > wrote:
> >
> > > Welcome and congrats!
> > >
> > > On Sun, Mar 31, 2019 at 12:48 PM Anirudh Subramanian <
> > > anirudh2...@gmail.com>
> > > wrote:
> > >
> > > > Hi all,
> > > >
> > > > Please join me to welcome Alex Zai as a new committer of Apache
> > > > (incubating) MXNet!
> > > >
> > > > Alex has been instrumental in bringing MKLDNN from experimental to
> > > > making
> > > it
> > > > default on MXNet master. This involved adding Python and C++ unit
> > > > tests, improving CI coverage for MKLDNN, testing MKLDNN on different
> > > > platforms
> > > and
> > > > working on issues related to MKLDNN.
> > > >
> > > > PRs:
> > > >
> > > >
> > > https://github.com/apache/incubator-
> > mxnet/pulls?utf8=%E2%9C%93=is%3A
> > > pr+author%3Aazai91+
> > > >
> > > > Issues:
> > > >
> > > >
> > > https://github.com/apache/incubator-
> > mxnet/issues?utf8=%E2%9C%93=is%3
> > > Aissue+involves%3Aazai91
> > > >
> > > > Reviews:
> > > >
> > > >
> > > https://github.com/apache/incubator-
> > mxnet/pulls?page=1=is%3Apr+revie
> > > wed-by%3Aazai91=%E2%9C%93
> > > >
> > > > Dev:
> > > > https://lists.apache.org/list.html?d...@mxnet.apache.org:lte=3y:azai9
> > > > 1
> > > >
> > > > Thanks,
> > > >
> > > > Anirudh
> > > >
> > >
>


Re: [DISCUSS] Rebrand Gluon to MXNet imperative or something MXNet.

2019-03-22 Thread Lin Yuan
+1.

Just to give some of my real experience:
1) I advertised a recent GluonNLP blog post and many responses were "This seems
nice. So is Gluon a new library to replace MXNet?"
2) We visited customers at a unicorn company who showed interest in MXNet
but none of the engineers knew the relationship between GluonNLP/GluonCV
and MXNet
3) When integrating MXNet to Horovod and adding examples, I received
comments like "What is Gluon? Is it a new library in addition to MXNet?"

Everyone is talking about PyTorch nowadays, but not Caffe2 anymore, although
the latter is still serving as a backend component. Maybe we should also
double down on one brand?

Lin

On Fri, Mar 22, 2019 at 4:02 PM Pedro Larroy 
wrote:

> Hi dev@
>
> We heard feedback from users that the Gluon name is confusing. Some of
> them don't even know it's MXNet, and its relationship with MXNet is
> unclear.
>
> Would it make sense to rebrand Gluon to just MXNet or MXNet
> imperative? Diluting brands and names is never a good idea.
>
> There's also GluonHQ, which is related to JavaFX, which adds to the
> confusion; search-engine friendliness is not high either.
>
> Pedro.
>


Re: [DISCUSS] Rebrand Gluon to MXNet imperative or something MXNet.

2019-03-22 Thread Lin Yuan
@Junru GluonNLP and GluonCV are definitely awesome toolkits. I feel we
should advertise more about these hidden treasures :)

Today there is a big initiative to publicize MXNet. I feel we should also
bring GluonNLP and GluonCV on the same boat and highlight their tight
relations with MXNet.

My two cents.

Lin



On Fri, Mar 22, 2019 at 6:08 PM Junru Shao  wrote:

> Probably we should figure out how to explain MXNet Gluon to customers. In
> this case, I agree with @Mu that
>
> 1) MXNet Gluon provides high-level API like what Keras gives to TensorFlow.
>
> 2) MXNet Gluon supports hybridization, which unifies both symbolic and
> imperative programming style.
>
> Also, about toolkits, we could mention
>
> 3) GluonNLP and GluonCV are two awesome libraries in their respective
> domain, both of which are built on MXNet Gluon. They not only provide an
> awesome exemplary codebase for customers to learn the best way to use MXNet
> Gluon, but also come with the state-of-the-art models and training
> techniques out-of-the-box.
>
> Any other ideas?
>
>
> On Fri, Mar 22, 2019 at 5:54 PM Pedro Larroy  >
> wrote:
>
> > +1 to MXNet Gluon given the feedbacks and explanations from everyone so
> > far.
> >
> > On Fri, Mar 22, 2019 at 5:09 PM Junru Shao 
> > wrote:
> > >
> > > I feel like MXNet Gluon is a good name. You don't lose customers who
> have
> > > been familiar with MXNet, nor lose customers who are used to MXNet
> > symbolic.
> > >
> > > On Fri, Mar 22, 2019 at 5:07 PM Davydenko, Denis <
> > > dzianis.davydze...@gmail.com> wrote:
> > >
> > > > As subject suggests this is a proposal for re-branding of Gluon to
> > align
> > > > it with MXNet. One of the common things undertaken for re-branding
> > > > exercises is renaming. That's the thinking behind my suggesting a
> > > > new name for Gluon. I am sincerely curious what the alternatives
> > > > would be to
> rebrand
> > > > Gluon to align it with MXNet without changing its name.
> > > >
> > > >
> > > > On 3/22/19, 4:57 PM, "Mu Li"  wrote:
> > > >
> > > > Are you proposing to rename Gluon? I think Pedro's opinion is
> > about a
> > > > better way to communicate what's Gluon and how it's related to
> > MXNet.
> > > >
> > > > On Fri, Mar 22, 2019 at 4:54 PM Davydenko, Denis
> > > > 
> > > > wrote:
> > > >
> > > > > I support idea of putting brands of MXNet and Gluon closer
> > together.
> > > > I
> > > > > agree with your argument, Mu, but MXNet is quite far away from
> TF
> > > > place at
> > > > > this time so I don’t know how well that argument is
> transferable
> > > > from TF
> > > > > position to MXNet position.
> > > > >
> > > > > MXNet Imperative is definitely too restrictive of a name, we
> can
> > > > come up
> > > > > with better one... MXNet-M for example, stands for
> MXNet-Modified
> > > > (military
> > > > > connotation). If naming is the only thing we need to figure
> out -
> > > > that is a
> > > > > good place to be in __
> > > > >
> > > > > --
> > > > > Thanks,
> > > > > Denis
> > > > >
> > > > > On 3/22/19, 4:48 PM, "Mu Li"  wrote:
> > > > >
> > > > > Gluon is about imperative neural network training and data
> > > > loading.
> > > > > ndarray
> > > > > is another large imperative module. Besides, Gluon also
> > supports
> > > > > symbolic
> > > > > execution after hybridizing. "MXNet imperative" might not be a
> > > > > good name for it. Another choice is "high-level API"; that's how
> > > > > TF talks about Keras.
> > > > >
> > > > > On Fri, Mar 22, 2019 at 4:38 PM Yuan Tang <
> > > > terrytangy...@gmail.com>
> > > > > wrote:
> > > > >
> > > > > > +1
> > > > > >
> > > > > > On Fri, Mar 22, 2019 at 7:29 PM Lin Yuan <
> > apefor...@gmail.com>
> > > > > wrote:
> > > > > >
> > > >

Re: [DISCUSS] Rebrand Gluon to MXNet imperative or something MXNet.

2019-03-22 Thread Lin Yuan
@Junru I fully agree with what you said. What I meant is we need to make
more customers aware of them.

Lin

On Fri, Mar 22, 2019 at 6:34 PM Junru Shao  wrote:

> @Lin I believe that the way to build a healthy community is to make both
> customers and developers happy. In this case, I feel like the more
> important thing about toolkits is to explain how useful they are to our
> customers, rather than positions, components or anything else.
>
> As I mentioned above, the usefulness comes from two aspects (at least).
>
> 1) they provide state-of-the-art models and training techniques
> out-of-the-box. If our customers want inference only, we have model zoo; If
> our customers want to train on their own dataset, we have awesome training
> tricks enclosed.
>
> 2) it provides an exemplary codebase for anyone who wants to use Gluon
> elegantly. It helps a lot with real-world development, compared with
> the simplest examples like tutorials.
>
>
> On Fri, Mar 22, 2019 at 6:07 PM Junru Shao 
> wrote:
>
> > Probably we should figure out how to explain MXNet Gluon to customers. In
> > this case, I agree with @Mu that
> >
> > 1) MXNet Gluon provides high-level API like what Keras gives to
> TensorFlow.
> >
> > 2) MXNet Gluon supports hybridization, which unifies both symbolic and
> > imperative programming style.
> >
> > Also, about toolkits, we could mention
> >
> > 3) GluonNLP and GluonCV are two awesome libraries in their respective
> > domain, both of which are built on MXNet Gluon. They not only provide an
> > awesome exemplary codebase for customers to learn the best way to use
> MXNet
> > Gluon, but also come with the state-of-the-art models and training
> > techniques out-of-the-box.
> >
> > Any other ideas?
> >
> >
> > On Fri, Mar 22, 2019 at 5:54 PM Pedro Larroy <
> pedro.larroy.li...@gmail.com>
> > wrote:
> >
> >> +1 to MXNet Gluon given the feedbacks and explanations from everyone so
> >> far.
> >>
> >> On Fri, Mar 22, 2019 at 5:09 PM Junru Shao 
> >> wrote:
> >> >
> >> > I feel like MXNet Gluon is a good name. You don't lose customers who
> >> have
> >> > been familiar with MXNet, nor lose customers who are used to MXNet
> >> symbolic.
> >> >
> >> > On Fri, Mar 22, 2019 at 5:07 PM Davydenko, Denis <
> >> > dzianis.davydze...@gmail.com> wrote:
> >> >
> >> > > As subject suggests this is a proposal for re-branding of Gluon to
> >> align
> >> > > it with MXNet. One of the common things undertaken for re-branding
> >> > > exercises is renaming. That's the thinking behind my suggesting a
> >> > > new name for Gluon. I am sincerely curious what the alternatives
> >> > > would be to
> >> rebrand
> >> > > Gluon to align it with MXNet without changing its name.
> >> > >
> >> > >
> >> > > On 3/22/19, 4:57 PM, "Mu Li"  wrote:
> >> > >
> >> > > Are you proposing to rename Gluon? I think Pedro's opinion is
> >> about a
> >> > > better way to communicate what's Gluon and how it's related to
> >> MXNet.
> >> > >
> >> > > On Fri, Mar 22, 2019 at 4:54 PM Davydenko, Denis
> >> > > 
> >> > > wrote:
> >> > >
> >> > > > I support idea of putting brands of MXNet and Gluon closer
> >> together.
> >> > > I
> >> > > > agree with your argument, Mu, but MXNet is quite far away from
> >> TF
> >> > > place at
> >> > > > this time so I don’t know how well that argument is
> transferable
> >> > > from TF
> >> > > > position to MXNet position.
> >> > > >
> >> > > > MXNet Imperative is definitely too restrictive of a name, we
> can
> >> > > come up
> >> > > > with better one... MXNet-M for example, stands for
> >> MXNet-Modified
> >> > > (military
> >> > > > connotation). If naming is the only thing we need to figure
> out
> >> -
> >> > > that is a
> >> > > > good place to be in __
> >> > > >
> >> > > > --
> >> > > > Thanks,
> >> > > > Denis
> >> > > >
> >> > > > On 3/22/19, 4:48 PM, "Mu Li"  wrote:
> >> > 

Re: [DISCUSS] Rebrand Gluon to MXNet imperative or something MXNet.

2019-03-22 Thread Lin Yuan
@Junru Thanks for the clarification. Given that we already have courseware
and books using Gluon, it makes sense to brand “MXNet Gluon” with Gluon
being the high-level API of MXNet.

@Tianqi what’s the roadmap for GluonNLP/GluonCV? Are they positioned to be
high-level APIs of MXNet or plug-and-play components that could
potentially be put on top of other frameworks in the future? If the former,
should we always highlight MXNet whenever we advertise GluonNLP?

Thanks

Lin

On Fri, Mar 22, 2019 at 5:41 PM Tianqi Chen 
wrote:

> Changing the name Gluon will result in a significant problem of backward
> compatibility for many of the current users, and that would be a huge -1
> for the current community.
> One possibility is to have a clear roadmap for 2.0 (which gives
> the message of non-backward compatibility) where we can discuss which
> features to consolidate, but perhaps that will require a bit more thought
> and coordinated effort.
>
> Tianqi
>
> On Fri, Mar 22, 2019 at 5:39 PM Junru Shao 
> wrote:
>
> > @Tianqi For sure GluonCV and GluonNLP should go with the current name. No
> > reason to change.
> >
> > @Lin If customers are interested, I guess we could say they are awesome
> > toolkits built on top of MXNet Gluon API, and perfect illustration to
> write
> > clever and powerful code on the top of it.
> >
>


[RFC] Higher order gradient support in MXNet

2019-04-04 Thread Lin Yuan
Dear Community,

Higher-order gradient calculation is required for many applications.
However, the current MXNet only supports higher-order gradients for a very
limited number of operators.

We plan to support higher-order gradient calculation in the autograd
package. A design proposal is ready for review:
https://cwiki.apache.org/confluence/display/MXNET/Higher+Order+Gradient+Calculation
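As a quick illustration of what a second-order gradient means numerically, here is a central-difference check in plain Python, independent of MXNet's autograd API:

```python
def second_difference(f, x, h=1e-4):
    # Central second difference: approximates the second derivative f''(x),
    # i.e. the gradient of the gradient that autograd would compute exactly.
    return (f(x + h) - 2.0 * f(x) + f(x - h)) / (h * h)

# For f(x) = x**3, f''(x) = 6*x, so f''(2) should be close to 12.
approx = second_difference(lambda t: t ** 3, 2.0)
```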

We would appreciate any help and feedback from the community.

Cheers!

Lin


Re: [DISCUSS] Process to remove deprecated operators

2019-02-27 Thread Lin Yuan
Sheng,

Thanks for your quick response.
If that's the case, we will wait till 2.0 release to remove the deprecated
operators from code.

Best,
Lin

On Wed, Feb 27, 2019 at 9:06 PM Sheng Zha  wrote:

> MXNet follows semantic versioning so we will be able to delete them in the
> next major release.
>
> -sz
>
> On Wed, Feb 27, 2019 at 8:53 PM Lin Yuan  wrote:
>
> > Dear Community,
> >
> > In MXNet there are many legacy operators such as this
> > <
> >
> http://mxnet.incubator.apache.org/versions/master/api/python/symbol/symbol.html?highlight=convolution_v1#mxnet.symbol.Convolution_v1
> > >
> > that has been marked DEPRECATED for several releases. However, these
> > operators still exist in our code. This causes a few problems:
> >
> > 1) Makes the codebase bloated and reduces readability
> > 2) Increases unnecessary maintenance effort
> > 3) Bug-prone, as some people will look at this legacy code as an example
> > 4) Causes confusion to end users and makes the documentation page lengthy
> >
> > I would like to propose the following process (if there is no existing
> > one) to remove deprecated operators from our code base.
> >
> > 1. Document the deprecated operators/environment variables in the release
> > note as well as man pages.
> > 2. Limit the life cycle of deprecated operators/arguments to two minor
> > releases. For example, if an operator is marked deprecated in the 1.4
> > release, it will be removed in the 1.6 release.
> > 3. If there are concerns raised by customers during the 1.4 and 1.5
> > releases, we can convert the deprecated operator back and it will
> > be treated as a new operator.
> > 4. PRs that remove deprecated operators should contain [Cleanup] in the
> > title.
> >
> > Any comment is appreciated.
> >
> > Lin
> >
>


[DISCUSS] Process to remove deprecated operators

2019-02-27 Thread Lin Yuan
Dear Community,

In MXNet there are many legacy operators such as this

that has been marked DEPRECATED for several releases. However, these
operators still exist in our code. This causes a few problems:

1) Makes the codebase bloated and reduces readability
2) Increases unnecessary maintenance effort
3) Bug-prone, as some people will look at this legacy code as an example
4) Causes confusion to end users and makes the documentation page lengthy

I would like to propose the following process (if there is no existing one)
to remove deprecated operators from our code base.

1. Document the deprecated operators/environment variables in the release
note as well as man pages.
2. Limit the life cycle of deprecated operators/arguments to two minor
releases. For example, if an operator is marked deprecated in the 1.4
release, it will be removed in the 1.6 release.
3. If there are concerns raised by customers during the 1.4 and 1.5
releases, we can convert the deprecated operator back and it will
be treated as a new operator.
4. PRs that remove deprecated operators should contain [Cleanup] in the
title.

Any comment is appreciated.

Lin


Re: [DISCUSS] Process to remove deprecated operators

2019-02-27 Thread Lin Yuan
Agreed. When we deprecate an operator, we should add in the log message
something like "Operator X is deprecated and will be removed in the
next release. Please use operator Y instead."
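A minimal Python sketch of such a message (the operator names echo the Convolution_v1 example from this thread; the wrapper itself is illustrative, not MXNet's actual deprecation mechanism):

```python
import warnings

def convolution_v1(*args, **kwargs):
    # Illustrative deprecated wrapper: warn at the call site, naming the
    # replacement operator as suggested above.
    warnings.warn(
        "Operator Convolution_v1 is deprecated and will be removed in the "
        "next release. Please use operator Convolution instead.",
        DeprecationWarning,
        stacklevel=2,
    )
    return None  # a real implementation would dispatch to the legacy kernel
```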

Lin

On Wed, Feb 27, 2019 at 10:23 PM Junru Shao  wrote:

> Hi Lin,
>
> I would love to share some immature ideas about deprecating operators. Not
> only should we adopt semantic versioning, but we should also provide
> informative enough error messages for customers to understand how to replace
> deprecated operators with new ones.
>
> Thanks,
> Junru
>
> On Wed, Feb 27, 2019 at 9:30 PM Lin Yuan  wrote:
>
> > Sheng,
> >
> > Thanks for your quick response.
> > If that's the case, we will wait till 2.0 release to remove the
> deprecated
> > operators from code.
> >
> > Best,
> > Lin
> >
> > On Wed, Feb 27, 2019 at 9:06 PM Sheng Zha  wrote:
> >
> > > MXNet follows semantic versioning so we will be able to delete them in
> > the
> > > next major release.
> > >
> > > -sz
> > >
> > > On Wed, Feb 27, 2019 at 8:53 PM Lin Yuan  wrote:
> > >
> > > > Dear Community,
> > > >
> > > > In MXNet there are many legacy operators such as this
> > > > <
> > > >
> > >
> >
> http://mxnet.incubator.apache.org/versions/master/api/python/symbol/symbol.html?highlight=convolution_v1#mxnet.symbol.Convolution_v1
> > > > >
> > > > that has been marked DEPRECATED for several releases. However, these
> > > > operators still exist in our code. This causes a few problems:
> > > >
> > > > 1) Makes the codebase bloated and reduces readability
> > > > 2) Increases unnecessary maintenance effort
> > > > 3) Bug-prone, as some people will look at this legacy code as an example
> > > > 4) Causes confusion to end users and makes the documentation page lengthy
> > > >
> > > > I would like to propose the following process (if there is no
> > > > existing one) to remove deprecated operators from our code base.
> > > >
> > > > 1. Document the deprecated operators/environment variables in the
> > > > release note as well as man pages.
> > > > 2. Limit the life cycle of deprecated operators/arguments to two minor
> > > > releases. For example, if an operator is marked deprecated in the 1.4
> > > > release, it will be removed in the 1.6 release.
> > > > 3. If there are concerns raised by customers during the 1.4 and 1.5
> > > > releases, we can convert the deprecated operator back and it will
> > > > be treated as a new operator.
> > > > 4. PRs that remove deprecated operators should contain [Cleanup] in
> > > > the title.
> > > >
> > > > Any comment is appreciated.
> > > >
> > > > Lin
> > > >
> > >
> >
>


Re: Call for Ideas and Approaches to Community Building

2019-03-17 Thread Lin Yuan
Zach,

Thanks for joining the MXNet project and for your very thoughtful
discussion. We do have virtual hangouts/meetups. Please refer to
https://cwiki.apache.org/confluence/display/MXNET/Meetups+and+Hangouts

I also strongly agree with your 4). I think we should have a clear roadmap
on our wiki page and/or GitHub repo.

Again, welcome on board!

Lin


On Sun, Mar 17, 2019 at 7:33 AM Zhao, Patric  wrote:

> Very great points!
>
> +1 for 4) and 5)
>
>
> > -Original Message-
> > From: Zach Boldyga [mailto:z...@scalabull.com]
> > Sent: Sunday, March 17, 2019 8:33 AM
> > To: dev@mxnet.incubator.apache.org
> > Subject: Re: Call for Ideas and Approaches to Community Building
> >
> > This is a great discussion, thanks for opening, Carin!
> >
> > As a newcomer to MXNet and Apache communities in general, I’ve been
> > considering what I can bring to the table here, and what importance it
> would
> > have to me.
> >
> > I'm not employed by large organizations, and communities like this are
> > perhaps the only way to be involved in projects of such a large scale and
> > importance. An opportunity to join this type of team without the full
> > commitment of employment is fantastic! I see potential for this to be a
> form
> > of validation, a chance to meet others and build professional
> relationships,
> > and a vehicle to learn from some of the most well-educated people in the
> > industry.
> >
> > That said, here’s what I’ve noticed thus far:
> >
> > 1. There is a healthy amount of activity in Github Issues, and the
> committers
> > are doing a great job at allowing newcomers to jump in. I was able to get
> > started on my first ticket within 10 minutes of searching through issues.
> >
> > 2. The dev mailing list is a great place to discuss all of the nuances
> of the
> > project. I also like meeting people and it would be rewarding to get to
> know
> > people in the community via Skype or in-person meetups! This doesn’t have
> > to be for everyone, and I don’t think it’s appropriate for Q&A, but for
> some
> > people a social element purely for the sake of putting names with faces
> can
> > be rewarding. I’m open to virtual meetups :)
> >
> > 3. My first commit was smooth. When approaching the second one, I’m
> > hitting some hiccups. For instance, I recently created a JIRA ticket
> based on a
> > Github Issue some users reported, and the ticket has been sitting for a
> week
> > without any activity. Should I just dig in and open a PR? How do the
> > commiters decide what can and can’t reasonably go into the project? We
> > may be able to make some changes to the contribution documentation or
> > processes to make it easier for first time contributors to ramp-up into
> regular
> > contributors?
> >
> > 4. I would love to see more discussion about the future of MXNet. I
> imagine
> > those who have been involved in the project for a long time have thoughts
> > about next major steps, but as an outsider I’m not sure where to find
> this
> > information. The roadmap on Github is fairly short-term and outdated, and
> > lots of interesting ideas are sprouting in projects like TF Swift as of
> 2019.
> >
> > 5. Something I’ve observed across many Apache projects: there isn’t much
> > focus on marketing. I wonder why? A tool like TensorFlow is reaching 10x
> > more people, mainly because of marketing.
> >
> > Best,
> >
> > Zach Boldyga
> > Scalabull  |  Founder
> > 1 (866) 846-8771 x 101
> >
> >
> > On Thu, Mar 7, 2019 at 5:38 AM Tianqi Chen 
> > wrote:
> >
> > > what happens (also) happens in the mailing list.
> > >
> > > If certain things or a person’s contribution are only known by
> > > colleagues, it is an indication of something that should be improved
> > > toward a more Apache way.
> > >
> > > Tianqi
> > >
> > > On Thu, Mar 7, 2019 at 4:42 AM Isabel Drost-Fromm 
> > > wrote:
> > >
> > > > On Wed, Mar 06, 2019 at 10:03:57PM -0800, Steffen Rochel wrote:
> > > > > I agree with Tianqi on "One approach toward building a more
> > > > > diverse community is to acknowledge the fact that we want to
> > > > > encourage
> > > > interactions
> > > > > in the Apache way beyond our physical cycle." However, I disagree
> > > > > with
> > > > his
> > > > > suggestion regarding "One principle to toward that is to encourage
> > > > > PMC members only nominate committers from other organizations" for
> > > > > the following reasons: [...]
> > > >
> > > > I spent quite some time digging remembering that a similar topic had
> > > > been discussed somewhere at the ASF at some point in time with many
> > > > whys, pros and cons towards contributor employer diversity - finally
> > > > found a long and winding thread there:
> > > >
> > > >
> > > >
> > >
> > https://lists.apache.org/thread.html/7a7412316ddbe1d43f5fb3d3703ea25a6
> > >
> > b26e56de602e27e175785c0@1337815698@%3Cgeneral.incubator.apache.or
> > g%3E
> > > >
> > > >
> > > > There is one answer in there from Roy Fielding which has a similar
> > > > story to the one that you are 

Re: [Announcement] New Committer - Patric Zhao

2019-03-21 Thread Lin Yuan
Congrats, Patric!

On Thu, Mar 21, 2019 at 10:32 AM Yuxi Hu  wrote:

> Congrats, Patric! Well deserved!
>
> On Wed, Mar 20, 2019 at 1:08 PM kellen sunderland <
> kellen.sunderl...@gmail.com> wrote:
>
> > Congrats Patric!
> >
> > On Sun, Mar 17, 2019 at 10:34 PM Hagay Lupesko 
> wrote:
> >
> > > Congrats Patric!
> > >
> > > On Fri, Mar 15, 2019 at 7:49 AM Joshua Z. Zhang 
> > > wrote:
> > >
> > > >
> > > >
> > > >
> > > >  Congrats Patrick!
> > > >
> > > >
> > > >
> > > >
> > > >
> > > >  Zhi
> > > >
> > > > >
> > > > > On Mar 15, 2019 at 10:46 PM, marco.g.ab...@gmail.com wrote:
> > > > >
> > > > >
> > > > >
> > > > >  Congratulations, great to have you on board!
> > > > >
> > > > > -Marco
> > > > >
> > > > > Lv, Tao A wrote on Fri., Mar 15, 2019, 15:38:
> > > > >
> > > > > >  Wow, congratulation Patric!
> > > > > >
> > > > > >  -Original Message-
> > > > > >  From: Steffen Rochel [mailto:steffenroc...@gmail.com]
> > > > > >  Sent: Friday, March 15, 2019 10:25 PM
> > > > > >  To: dev@mxnet.incubator.apache.org
> > > > > >  Cc: patric zhao  
> > > > > >  Subject: Re: [Announcement] New Committer - Patric Zhao
> > > > > >
> > > > > >  Congratulation Patrick!
> > > > > >  Steffen
> > > > > >
> > > > > >  On Fri, Mar 15, 2019 at 5:38 AM Zhao, Patric  <
> > > patric.z...@intel.com>
> > > >
> > > > > >  wrote:
> > > > > >
> > > > > >   >  I am very glad to have this opportunity to contribute to the
> > > > > >   >  Apache/MXNet community :)
> > > > > >   >
> > > > > >   >  Thanks all of the supports from the community and Intel.
> > > > > >   >
> > > > > >   >  BR,
> > > > > >   >
> > > > > >   >  --Patric
> > > > > >   >
> > > > > >   >
> > > > > >   >   >  -Original Message-
> > > > > >   >   >  From: MiraiWK WKCN [mailto:w...@live.cn]
> > > > > >   >   >  Sent: Friday, March 15, 2019 12:52 AM
> > > > > >   >   >  To: dev@mxnet.incubator.apache.org; patric zhao
> > > > > >   >   >   
> > > > > >   >   >  Subject: Re: [Announcement] New Committer - Patric Zhao
> > > > > >   >   >
> > > > > >   >   >  Welcome Peng Zhao!
> > > > > >   >   >  Peng is the AI Tech Leader at Intel Corporation. We have had
> > > > > >   >   >  good cooperation before. He is very professional and has
> > > > > >   >   >  contributed a lot to MXNet, especially deep learning
> > > > > >   >   >  acceleration on CPU.
> > > > > >   >   >
> > > > > >   >   >  
> > > > > >   >   >  From: Anirudh Subramanian  
> > > > > >   >   >  Sent: Thursday, March 14, 2019 3:54:50 PM
> > > > > >   >   >  To: dev@mxnet.incubator.apache.org; patric zhao
> > > > > >   >   >  Subject: [Announcement] New Committer - Patric Zhao
> > > > > >   >   >
> > > > > >   >   >  Hi all,
> > > > > >   >   >
> > > > > >   >   >  Please join me to welcome Patric Zhao as a new committer
> > of
> > > > Apache
> > > > > >   >   >  (incubating) MXNet!
> > > > > >   >   >
> > > > > >   >   >  Patric has put in great effort around MKLDNN integration
> > > into
> > > > MXNet
> > > > > >   >   >  and
> > > > > >   >  has
> > > > > >   >   >  been involved in features like quantization, graph
> fusion
> > > and
> > > > fused
> > > > > >   >   >  RNN operators for CPU.
> > > > > >   >   >
> > > > > >   >   >  Dev List activity:
> > > > > >   >   >
> > > > > >   >
> > > >
> https://lists.apache.org/list.html?d...@mxnet.apache.org:lte=3y:patric.
> > > > > >   >  zhao
> > > > > >   >   >
> > > > > >   >   >  Issues:
> > > > > >   >   >  https://github.com/apache/incubator-
> > > > > >   >   >
> > > > mxnet/issues?utf8=%E2%9C%93=is%3Aissue+involves%3Apengzhao-intel+
> > > > > >   >   >
> > > > > >   >   >  PR Reviews:
> > > > > >   >   >  https://github.com/apache/incubator-
> > > > > >   >   >
> > > > mxnet/pulls?utf8=%E2%9C%93=is%3Apr+reviewed-by%3Apengzhao-intel
> > > > > >   >   >
> > > > > >   >   >  Proposals involved in:
> > > > > >   >   >
> > > > https://cwiki.apache.org/confluence/display/MXNET/MXNet+Graph+Optimi
> > > > > >   >   >  z
> > > > > >   >   >  ation+and+Quantization+based+on+subgraph+and+MKL-DNN
> > > > > >   >   >
> > > > https://cwiki.apache.org/confluence/display/MXNET/Fused+RNN+Operator
> > > > > >   >   >  s
> > > > > >   >   >  +for+CPU
> > > > > >   >   >   <
> > > > https://cwiki.apache.org/confluence/display/MXNET/MXNet+Graph+Optim
> > > > > >   >   >  i
> > > > > >   >   >  zation+and+Quantization+based+on+subgraph+and+MKL-DNN>
> > > > > >   >   >
> > > > > >   >   >
> > > > > >   >   >  Thanks,
> > > > > >   >   >  Anirudh
> > > > > >   >
> > > > > >
> > > > >
> > >
> >
>
>
> --
> Yuxi(Darren) Hu, Ph.D.
> Software Development Engineer
> Amazon Web Services
>


Re: [Announce] Runtime feature detection

2019-02-12 Thread Lin Yuan
Thanks, Pedro, for contributing this long-awaited feature. I can
immediately use it for the Horovod project now.

Bravo!

Lin

On Tue, Feb 12, 2019 at 2:42 AM Pedro Larroy 
wrote:

> An update on this topic, Sheng just merged the refinements to the
> feature detection so it's now a single API call. (
> https://github.com/apache/incubator-mxnet/pull/13964 ). Thank you
> Sheng for the reviews.
>
> Please use this functionality to check for capabilities of MXNet at
> runtime such as Cuda, OpenCV etc. This can simplify tests and
> automation in several places in the code.
>
> Lin Iblis is already preparing Julia support:
> https://github.com/apache/incubator-mxnet/pull/13992
>
> This is a PR that adds documentation on the feature and explains how
> to use it from Python:
> https://github.com/apache/incubator-mxnet/pull/14130
>
> Thanks.
>
> On Fri, Jan 25, 2019 at 7:08 PM Sheng Zha  wrote:
> >
> > Hi Pedro,
> >
> > Happy to help, though I was waiting for PR comments to be addressed.
> Currently the PR is close to complete, with some open comments to be
> resolved.
> >
> > -sz
> >
> > > On Jan 25, 2019, at 9:27 AM, Pedro Larroy <
> pedro.larroy.li...@gmail.com> wrote:
> > >
> > > That's Great! There's a PR that we should merge first which
> > > internalizes the enum inside the library as per Sheng's suggestion.
> > >
> > > https://github.com/apache/incubator-mxnet/pull/13964
> > >
> > > @Sheng could we merge the PR? so we can build on top of this feature?
> > > It's badly needed for tests suites etc.
> > > Thanks a lot!
> > >
> > > Pedro.
> > >
> > >
> > >> On Fri, Jan 25, 2019 at 2:22 PM Iblis Lin 
> wrote:
> > >>
> > >> Hi,
> > >>
> > >> I added the Julia binding for it.
> > >> PR is here:
> > >> https://github.com/apache/incubator-mxnet/pull/13992
> > >>
> > >> Iblis Lin
> > >> 林峻頤
> > >>
> > >>> On 1/23/19 12:39 AM, Pedro Larroy wrote:
> > >>> Hi
> > >>>
> > >>> I'm pleased to announce that runtime feature detection has been
> merged
> > >>> in master, thanks to Aaron for the merge and the many reviewers who
> > >>> gave feedback on the PR.  (
> > >>> https://github.com/apache/incubator-mxnet/pull/13549 )
> > >>>
> > >>> As the functionality matures and is exposed through other bindings,
> > >>> please feel free to try and use it to build on it, for example for
> > >>> easier test suite selection depending on what's compiled in the
> > >>> engine.
> > >>>
> > >>> Usage examples:
> > >>>
> > >>> $ ipython
> > >>> In [4]: import mxnet.mxfeatures
> > >>>
> > >>> In [5]: mxnet.mxfeatures.features_enabled()
> > >>> Out[5]:
> > >>> [<Feature.CPU_SSE>,
> > >>> <Feature.CPU_SSE2>,
> > >>> <Feature.CPU_SSE3>,
> > >>> <Feature.CPU_SSE4_1>,
> > >>> <Feature.CPU_SSE4_2>,
> > >>> <Feature.CPU_AVX>,
> > >>> <Feature.F16C>,
> > >>> <Feature.BLAS_OPEN>,
> > >>> <Feature.LAPACK>,
> > >>> <Feature.SIGNAL_HANDLER>,
> > >>> <Feature.DEBUG>]
> > >>>
> > >>> In [6]: mxnet.mxfeatures.features_enabled_str()
> > >>> Out[6]: 'CPU_SSE, CPU_SSE2, CPU_SSE3, CPU_SSE4_1, CPU_SSE4_2,
> CPU_AVX,
> > >>> F16C, BLAS_OPEN, LAPACK, SIGNAL_HANDLER, DEBUG'
> > >>>
> > >>> see also: help(mxnet.mxfeatures)
> > >>>
> > >>> Regards.
> > >>>
>
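A minimal sketch of the intended use, selecting tests based on detected capabilities rather than hardcoding build variants. The `features_enabled_str` helper below is a stand-in that mimics the output shown above; a real script would call the `mxnet.mxfeatures` API instead:

```python
# Sketch of gating tests on runtime-detected features. The stand-in below
# mimics the string shown above; it is NOT the real mxnet.mxfeatures call.

def features_enabled_str():
    # Stand-in for the library call; reports what was compiled in.
    return ("CPU_SSE, CPU_SSE2, CPU_SSE3, CPU_SSE4_1, CPU_SSE4_2, CPU_AVX, "
            "F16C, BLAS_OPEN, LAPACK, SIGNAL_HANDLER, DEBUG")

def is_enabled(feature):
    """True if the named feature was compiled into the library."""
    enabled = {f.strip() for f in features_enabled_str().split(",")}
    return feature in enabled

def run_gpu_tests():
    # Skip GPU-only suites when CUDA support is absent from the build.
    if not is_enabled("CUDA"):
        return "skipped: CUDA not compiled in"
    return "ran GPU tests"

print(is_enabled("LAPACK"))  # True
print(run_gpu_tests())       # skipped: CUDA not compiled in
```

The same check works for OpenCV, MKL-DNN, or any other optional component, which is what makes it useful for test-suite selection and automation.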


Re: [RESTARTING][VOTE] Release Apache MXNet (incubating) version 1.4.0.rc2

2019-02-11 Thread Lin Yuan
+1 binding
Horovod is going to release its 0.16.0 in the coming week with MXNet
integration. We need to release 1.4.0 which includes all the dependencies
for Horovod integration.

Best,

Lin

On Mon, Feb 11, 2019 at 9:30 PM Steffen Rochel 
wrote:

> Dear community -
> based on Justin's and community feedback I'm suggesting to restart the
> vote.
> Current status:
> binding votes:
> +1: 2 votes (Henri, Jason)
> -1:  1 vote (Luciano)
>
> non-binding:
> +1: 1 vote (Kellen)
>
> The community is investigating feedback from Luciano that the exclusion
> file is too broad, potentially causing files which can and must have
> Apache license headers not to be checked.
>
> Regards,
> Steffen
>
>
>
>
> On Mon, Feb 11, 2019 at 10:08 AM Hagay Lupesko  wrote:
>
> > Based on Justin's feedback, can we resume the vote instead of cancelling
> > it?
> >
> > On Mon, Feb 11, 2019 at 12:02 AM Justin Mclean  >
> > wrote:
> >
> > > Hi,
> > >
> > > In future don’t be so hasty to cancel a release vote, people mind can
> be
> > > changed and a -1 is not a veto on a release.
> > >
> > > Thanks,
> > > Justin
> > >
> > >
> > > -
> > > To unsubscribe, e-mail: general-unsubscr...@incubator.apache.org
> > > For additional commands, e-mail: general-h...@incubator.apache.org
> > >
> > >
> >
>


Re: [Announcement] New Committer - Nicolas Modrzyk

2019-02-15 Thread Lin Yuan
Welcome, Nicolas! Good to have you on board.

Lin

On Fri, Feb 15, 2019 at 8:03 AM Carin Meier  wrote:

> Please join me in welcoming Nicolas Modrzyk, (@hellonico), as a new
> committer.
>
> He has made valuable contributions to the Clojure package, especially in
> the areas of stability with integration tests and visualizations [1].
>
> We are excited to have him with us as a committer and look forward to
> future growth of the MXNet Clojure package and community.
>
> - Carin
>
>
> [1]
>
> https://github.com/apache/incubator-mxnet/pulls?utf8=%E2%9C%93=is%3Apr+hellonico+
>


Re: Horovod-MXNet Integration

2019-01-30 Thread Lin Yuan
Hi Yuan,

Thanks for your interest. We have just supported MXNet in Horovod and are
working on performance tuning and adding more examples. We are definitely
interested in further extending its support with Kubeflow.

Let's set up some time to have a more detailed discussion.

Best,

Lin

On Wed, Jan 30, 2019 at 7:42 AM Yuan Tang  wrote:

> Hi,
>
> It's great to see MXNet-Horovod integration got merged:
> https://github.com/uber/horovod/pull/542
>
> Is there any future plan for this? I've been working on Kubeflow's
> MPI-Operator (https://github.com/kubeflow/mpi-operator) lately and it
> would
> be interesting to see an example of using Horovod + MXNet + Kubeflow using
> MPI Operator. Feel free to reach out (@terrytangyuan
> <https://github.com/terrytangyuan>) if you encounter any issues.
>
> Best,
> Yuan
>
>
> On Fri, Nov 2, 2018 at 6:51 PM Lin Yuan  wrote:
>
> > Hi Mu,
> >
> > Darren (@yuxihu <https://github.com/yuxihu>) and I have been working on
> > releasing MXNet-Horovod integration in production. We have made some
> > changes on both MXNet and Horovod sides. The changes on MXNet side have
> > mostly been merged and we are working to merge code to horovod repo. We
> > will send a design doc to you for review again next week.
> >
> > Thanks for your feedback,
> >
> > Lin
> >
> > On Wed, Oct 31, 2018 at 12:03 PM Mu Li  wrote:
> >
> > > Thanks for your contribution, Carl.
> > >
> > > I remember I left a comment on the proposal, but today I found it was
> > > disappeared. My suggestion is trying best to not change the existing
> API.
> > > The reason is that we need to change all trainers on the frontend that
> > uses
> > > the existing kvstore APIs, which may cause confusion to users.
> > >
> > > The current proposal wants to add the following 4 APIs to kvstore:
> > >
> > >
> > >-
> > >
> > >kv.pushpull
> > >-
> > >
> > >kv.broadcast
> > >-
> > >
> > >kv.local_rank
> > >-
> > >
> > >kv.num_local_workers
> > >
> > >
> > > Pushpull can be done with a sequential push and pull: you can do
> > > nothing in push and put all the work into pushpull. Broadcast can be
> > > implemented by pull.
> > >
> > > What are local workers? GPUs in a single machine? If so, we can
> > > query it directly.
> > >
> > >
> > > On Fri, Sep 14, 2018 at 4:46 PM Carl Yang  wrote:
> > >
> > > > Hi,
> > > >
> > > > Currently, MXNet distributed can only be done using parameter server.
> > > > Horovod is an open-source distributed training framework that has
> > > > shown 2x speedup compared to TensorFlow using Parameter Server. We
> > > > propose to add Horovod support to MXNet. This will help our users
> > > > achieve goal of linear scalability to 256 GPUs and beyond. Design
> > > > proposal on cwiki:
> > > >
> > > >
> > >
> >
> https://cwiki.apache.org/confluence/display/MXNET/Horovod-MXNet+Integration
> > > >
> > > > Please feel free to let me know if you have any suggestions or
> > feedback.
> > > >
> > > > Regards,
> > > > Carl
> > > >
> > >
> >
>
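Mu Li's point above — that pushpull and broadcast need not be new primitives — can be sketched with a toy store. `ToyKVStore` below is purely illustrative and is not MXNet's KVStore API:

```python
# Illustration of composing pushpull from push + pull, per the suggestion
# above. ToyKVStore is a hypothetical stand-in, not MXNet's KVStore.

class ToyKVStore:
    def __init__(self):
        self._store = {}

    def push(self, key, value):
        # Sum-aggregate pushed values under the key (like gradient push).
        self._store[key] = self._store.get(key, 0.0) + value

    def pull(self, key):
        return self._store[key]

    def pushpull(self, key, value):
        # No new primitive needed: a push followed by a pull.
        self.push(key, value)
        return self.pull(key)

    def broadcast(self, key, init_value):
        # Broadcast of an initial value can likewise be served by pull.
        self._store.setdefault(key, init_value)
        return self.pull(key)

kv = ToyKVStore()
kv.push("w", 1.0)
print(kv.pushpull("w", 2.0))   # 3.0
print(kv.broadcast("b", 0.5))  # 0.5
```

Whether the composed form performs as well as a fused pushpull (e.g. over MPI ring-allreduce in Horovod) is exactly the trade-off the thread is debating; the sketch only shows the API-surface argument.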


Re: [Announcement] New Committer -- Lin Yuan

2019-02-04 Thread Lin Yuan
Thanks folks! I am looking forward to working with you to make MXNet shine
in 2019!

Best,

Lin

On Sun, Feb 3, 2019 at 4:31 PM Qing Lan  wrote:

> Congrats Lin!
> >
> >
> > Congratulations Lin
> >
> >> On Sat, Feb 2, 2019, 3:27 PM Tianqi Chen  wrote:
> >>
> >> Dear Community:
> >>
> >> Please join me to welcome Lin Yuan(@apeforest) as a new committer of
> >> Apache(incubating) MXNet!
> >>
> >> He has contributed to various improvements, including better
> >> support for larger arrays across the codebase.
> >>
> >> Commits:
> >> https://github.com/apache/incubator-mxnet/commits?author=apeforest
> >>
> >>
> https://github.com/apache/incubator-mxnet/pulls?utf8=%E2%9C%93=is%3Apr+author%3Aapeforest
> >>
> >>
> >> Reviews:
> >> https://github.com/apache/incubator-mxnet/pulls?utf8=%
> >> E2%9C%93=reviewed-by%3Aapeforest
> >>
> >> dev@ activitivity
> >> https://lists.apache.org/list.html?*@mxnet.apache.org:lte=6M:Lin%20Yuan
> >>
> >> Tianqi
> >>
>


Re: [VOTE] Release Apache MXNet (incubating) version 1.4.0.rc2

2019-02-04 Thread Lin Yuan
+1 Built from source on macOS 10.13.6 and tested the mxnet-to-coreml converter.

On Mon, Feb 4, 2019 at 9:03 AM Indhu  wrote:

> +1
>
> Built from source and tested a few examples from the examples folder.
>
> Thanks,
> Indu
>
>
>
> On Fri, Feb 1, 2019 at 6:21 PM Steffen Rochel 
> wrote:
>
> > Hi Sheng - thanks for the feedback.
> > TVM notice  file is missing as the 1.4.x branch/v1.4.0 release is using
> TVM
> > commit 0f053c8
> > <
> >
> https://github.com/dmlc/tvm/commit/0f053c82a747b4dcdf49570ec87c17e0067b7439
> > >
> >  from Oct 8, 2018, which didn't have the NOTICE file. IMHO, MXNet NOTICE
> > file is consistent with release content.
> > As the release started in 2018 I do think it is ok to move forward w/o
> > update to 2019 IMHO.
> >
> > All -
> > thanks to the committers/contributors (Tao, Aaron, Kellen, Aston, Yuxi)
> who
> > tested and provided feedback - we have five +1 votes.
> > As of today, Friday Feb 1st 2019 6pm PST we have two binding votes, one
> +1
> > (Carin), one +0 (Sheng). The vote continues be open waiting for feedback
> > from PMC members.
> > Hope you can spare some time over the weekend to provide feedback.
> >
> > Regards,
> > Steffen
> >
> > On Fri, Feb 1, 2019 at 12:44 AM Marco de Abreu 
> > wrote:
> >
> > > Considering the release process has been started last year and the code
> > tag
> > > has also been based on last year, I'd say that it is not really a big
> > deal.
> > >
> > > -Marco
> > >
> > > On Fri., Feb 1, 2019, 09:33 Sheng Zha wrote:
> > >
> > > > I found an awesome checklist for incubator releases [1] so I'm using
> it
> > > > here:
> > > >
> > > > -[Y] Are release files in correct location?
> > > > -[Y] Do release files have the word incubating in their name?
> > > > -[Y] Are the digital signature and hashes correct?
> > > > -[Y] Does DISCLAIMER file exist?
> > > > -[Y] Do LICENSE and NOTICE files exists?
> > > > -[N/A] Is the LICENSE and NOTICE text correct? (sz: did not finish
> > > > checking)
> > > > -[N] Is the NOTICE year correct?
> > > > -[N/A] Un-included software dependencies are not mentioned in LICENSE
> > or
> > > > NOTICE? (sz: did not finish checking)
> > > > -[Y] License information is not mentioned in NOTICE?
> > > > Is there any 3rd party code contained inside the release? If so:
> > > > -[Y] Does the software have a compatible license?
> > > > -[Y] Are all software licenses mentioned in LICENSE?
> > > > -[Y] Is the full text of the licenses (or pointers to it) in LICENSE?
> > > > Is any of this code Apache licensed? Do they have NOTICE files? If
> so:
> > > > -[N] Have relevant parts of those NOTICE files been added to this
> > NOTICE
> > > > file?
> > > > TVM has Apache 2.0 license and its NOTICE hasn't been added to
> MXNet's
> > > > NOTICE file.
> > > > -[Y] Do all source files have ASF headers? (sz: enforced by license
> > > > checker)
> > > > -[Y] Do the contents of the release match with what's tagged in
> version
> > > > control?
> > > > -[N] Are there any unexpected binary files in the release?
> > > > -[Y] Can you compile from source? Are the instruction clear?
> > > >
> > > > Is the issue minor?
> > > > - Unsure. NOTICE year is wrong (it's 2019 now). TVM's NOTICE is
> missing
> > > > from MXNet's NOTICE file.
> > > > Could it possibly be fixed in the next release?
> > > > - Yes
> > > > I vote with:
> > > > +0 not sure if it should be released. Could mentors advise if we
> should
> > > fix
> > > > them before release?
> > > >
> > > > [1] https://wiki.apache.org/incubator/IncubatorReleaseChecklist
> > > >
> > > >
> > > > On Thu, Jan 31, 2019 at 10:56 PM Lv, Tao A 
> wrote:
> > > >
> > > > >
> > > > > +1. Verified below items:
> > > > >
> > > > > 1. Checkout code from tag 1.4.0rc2 and build mkldnn backend
> > > successfully
> > > > > on both cpu and gpu w/ mkl and openblas
> > > > > 2. ResNet50v1 FP32 performance looks good for both latency and
> > > throughput
> > > > > 3. Quantization script works well with ResNet50v1
> > > > > 4. ResNet50v1 INT8 model accuracy looks good
> > > > > 5. ResNet50v1 INT8 model performance speedup looks good for both
> > > latency
> > > > > and throughput
> > > > >
> > > > >
> > > > > -Original Message-
> > > > > From: kellen sunderland [mailto:kellen.sunderl...@gmail.com]
> > > > > Sent: Friday, February 1, 2019 11:45 AM
> > > > > To: dev@mxnet.incubator.apache.org
> > > > > Subject: Re: [VOTE] Release Apache MXNet (incubating) version
> > 1.4.0.rc2
> > > > >
> > > > > Great, thanks Steffen!  I added a few key files but missed that
> one.
> > > > >
> > > > > +1 from me.
> > > > >
> > > > > On Thu, Jan 31, 2019 at 9:35 AM Steffen Rochel <
> > > steffenroc...@gmail.com>
> > > > > wrote:
> > > > >
> > > > > > Kellen - Sergey, the 1.4.0 release co-manager signed the tar
> file.
> > > > > > Please use his public key to validate the asc.
> > > > > > I was able to validate:
> > > > > >
> > > > > > curl https://dist.apache.org/repos/dist/dev/incubator/mxnet/KEYS
> > -o
> > > > > > KEYS
> > > > 

Re: [Announcement] New Committer -- Steffen Rochel

2019-02-05 Thread Lin Yuan
Welcome Steffen!

Lin

On Mon, Feb 4, 2019 at 7:53 PM kellen sunderland <
kellen.sunderl...@gmail.com> wrote:

> Great news.  Congrats Steffen.
>
> On Mon, Feb 4, 2019, 5:29 PM Thomas DELTEIL  wrote:
>
> > Welcome Steffen!
> >
> > On Mon, Feb 4, 2019, 15:55 Marco de Abreu  wrote:
> >
> > > Welcome!
> > >
> > > On Tue., Feb 5, 2019, 00:45 Chris Olivier wrote:
> > >
> > > > Dear Community:
> > > >
> > > > Please join me to welcome Steffen Rochel (steffenroc...@gmail.com)
> as
> > a
> > > > new
> > > > committer of Apache (incubating) MXNet!
> > > >
> > > > Steffen has played a role in nearly every MXNet release in the past
> 18
> > > > months, managed several of the wiki pages and has contributed in
> > > expanding
> > > > the community by managing and hosting meetups in different parts of
> the
> > > > world.
> > > >
> > > > -Chris
> > > >
> > >
> >
>


Re: Fujitsu Breaks ImageNet Record using MXNet (under 75 sec)

2019-04-08 Thread Lin Yuan
Chai,

Thanks for sharing. This is awesome news!

Lin

On Mon, Apr 8, 2019 at 8:48 AM Chaitanya Bapat  wrote:

> Greetings!
>
> Great start to a Monday morning, as I came across this news on Import AI,
> an AI newsletter.
>
> The newsletter talked about Apache MXNet, hence thought of sharing it with
> our community. This seems to be a great achievement worth paying attention
> to.
>
> *75 seconds: How long it takes to train a network against ImageNet:*
> *...Fujitsu Research claims state-of-the-art ImageNet training scheme...*
> Researchers with Fujitsu Laboratories in Japan have further reduced the
> time it takes to train large-scale, supervised learning AI models; their
> approach lets them train a residual network to around 75% accuracy on the
> ImageNet dataset after 74.7 seconds of training time. This is a big leap
> from where we were in 2017 (an hour), and is impressive relative to
> late-2018 performance (around 4 minutes: see issue #121
> <
> https://twitter.us13.list-manage.com/track/click?u=67bd06787e84d73db24fb0aa5=28edafc07a=0b77acb987
> >
> ).
>
> *How they did it: *The researchers trained their system across *2,048 Tesla
> V100 GPUs* via the Amazon-developed MXNet deep learning framework. They
> used a large mini-batch size of 81,920, and also implemented layer-wise
> adaptive scaling (LARS) and a 'warming up' period to increase learning
> efficiency.
>
> *Why it matters:* Training large models on distributed infrastructure is a
> key component of modern AI research, and the reduction in time we've seen
> on ImageNet training is striking - I think this is emblematic of the
> industrialization of AI, as people seek to create systematic approaches to
> efficiently training models across large amounts of computers. This trend
> ultimately leads to a speedup in the rate of research reliant on
> large-scale experimentation, and can unlock new paths of research.
> *  Read more:* Yet Another Accelerated SGD: ResNet-50 Training on ImageNet
> in 74.7 seconds (Arxiv)
> <
> https://twitter.us13.list-manage.com/track/click?u=67bd06787e84d73db24fb0aa5=d2b13c879f=0b77acb987
> >
> .
>
> NVIDIA article -
>
> https://news.developer.nvidia.com/fujitsu-breaks-imagenet-record-with-v100-tensor-core-gpus/
>
> Hope that gives further impetus to strive harder!
> Have a good week!
> Chai
>
>  --
> *Chaitanya Prakash Bapat*
> *+1 (973) 953-6299*
>
> 
>


Re: [QUESTION] mxnet/Tuple vs nnvm/Tuple

2019-04-16 Thread Lin Yuan
Jun,

Thanks! I was also leaning towards your suggestion.
I have updated nnvm::Tuple to mxnet::Tuple for a few remaining places in
MXNet.

Best,

Lin

On Tue, Apr 16, 2019 at 11:35 AM Jun Wu  wrote:

> include/mxnet/tuple.h was first copied from nnvm in this PR
> <https://github.com/apache/incubator-mxnet/pull/14270> so that we can make
> changes on it to support zero-dim and zero-size tensors without affecting
> TVM project. That PR has changed most of the places where nnvm::Tuple and
> nnvm::TShape were used to mxnet::Tuple and mxnet::TShape. If we still see a
> few locations not changed in the current codebase, we should change them to
> use mxnet Tuple as well for better cosmetics. The nnvm/tuple.h can be
> deprecated in MXNet.
>
> On Mon, Apr 15, 2019 at 10:44 PM Lin Yuan  wrote:
>
> > Dear Community,
> >
> > Currently in MXNet there are two Tuple template classes, defined in
> > mxnet/tuple.h and nnvm/tuple.h respectively. These two templates are
> > highly similar and mostly duplicated, except for a couple of functions.
> > However, they are used interchangeably in the current codebase,
> > sometimes causing conflicts.
> >
> > Is there any historical reason that we keep two copies of the same
> template
> > class? If not, can we refactor the code to consolidate into one?
> >
> > Thanks!
> >
> > Lin
> >
>


[QUESTION] mxnet/Tuple vs nnvm/Tuple

2019-04-15 Thread Lin Yuan
Dear Community,

Currently in MXNet there are two Tuple template classes, defined in
mxnet/tuple.h and nnvm/tuple.h respectively. These two templates are highly
similar and mostly duplicated, except for a couple of functions. However,
they are used interchangeably in the current codebase, sometimes causing
conflicts.

Is there any historical reason that we keep two copies of the same template
class? If not, can we refactor the code to consolidate into one?

Thanks!

Lin


Re: [Announcement] New Committer - Zhennan Qin

2019-04-30 Thread Lin Yuan
Congrats, Zhennan! Well deserved.

Lin

On Tue, Apr 30, 2019 at 3:07 PM Zhao, Patric  wrote:

> Congrats, Zhennan.
>
> Really great work, and it makes the MXNet quantization flow outstanding
> around the world!
>
> > -Original Message-
> > From: Lv, Tao A [mailto:tao.a...@intel.com]
> > Sent: Tuesday, April 30, 2019 11:01 PM
> > To: dev@mxnet.incubator.apache.org
> > Subject: RE: [Announcement] New Committer - Zhennan Qin
> >
> > Congratulations Zhennan!
> >
> > -Original Message-
> > From: Jun Wu [mailto:wujun@gmail.com]
> > Sent: Tuesday, April 30, 2019 12:29 PM
> > To: dev@mxnet.incubator.apache.org
> > Subject: [Announcement] New Committer - Zhennan Qin
> >
> > Please join me in welcoming Zhennan Qin (https://github.com/ZhennanQin)
> > from Intel as a new committer.
> >
> > Zhennan is the main author of accelerating MXNet/MKLDNN inference
> > through operator fusion and model quantization. His work has placed MXNet
> > in an advantageous position for inference workloads on Intel CPUs compared
> > with other DL frameworks.
>


Re: [Announcement] New Committer - Hao Jin

2019-05-01 Thread Lin Yuan
Congrats!

On Tue, Apr 30, 2019 at 11:28 PM Alex Zai  wrote:

> Congrats Hao!
>
> On Tue, Apr 30, 2019 at 10:53 PM Steffen Rochel 
> wrote:
>
> > congratulation Hao!
> >
> > On Tue, Apr 30, 2019 at 8:05 AM MiraiWK WKCN  wrote:
> >
> > > Congrats Hao! Welcome!
> > >
> > > 
> > > From: Lv, Tao A 
> > > Sent: Tuesday, April 30, 2019 11:00:33 PM
> > > To: dev@mxnet.incubator.apache.org
> > > Subject: RE: [Announcement] New Committer - Hao Jin
> > >
> > > Congratulations Hao!
> > >
> > > -Original Message-
> > > From: Jun Wu [mailto:wujun@gmail.com]
> > > Sent: Tuesday, April 30, 2019 12:29 PM
> > > To: dev@mxnet.incubator.apache.org
> > > Subject: [Announcement] New Committer - Hao Jin
> > >
> > > Please join me in welcoming Hao Jin (https://github.com/haojin2) from
> > AWS
> > > as a new committer.
> > >
> > > Hao has designed and implemented many sophisticated algorithms for
> tensor
> > > operations. His work has greatly expanded the coverage of MXNet
> operator
> > > inventory and enhanced the performance of many operators that are
> > > hard to optimize. Not only that, Hao has been active in advocating
> > > MXNet
> > > through providing high-quality translation service for quite a few
> > > technical articles and blog posts.
> > >
> >
>


Re: [RFC] Support for creation of Large Tensors in MXNet

2019-04-29 Thread Lin Yuan
Tao,

- what's the max size of dimensionality? Which data type is used to define
dimensionality (ndims)?
We assume the max size of dimensionality is relatively small. Hence `int`
data type is used to define ndim

- what's the max size of each dimension? Which data type is used to define
dimension size (shape[x])?
Currently, we assume the max size of each dimension is not going to exceed
2^31 in real applications. Hence the data type is `int32_t`

- what's the max size of total elements? Which data type is used to define
element size (Prod(shape))?
We assume the total number of elements in a tensor can be larger than 2^32
in some applications such as deep graph library. We use the data type
`int64_t` to represent the total element size. Currently, due to performance
regressions in some operators (such as transpose), we use a compiler flag
to set this data type to `int32_t` by default. Once we have ways to
mitigate the performance regression, we will set the default data type to
`int64_t`, which is part of the effort in this project that Rohit proposed.

What is the plan in MKLDNN to support large tensors? We may want to
coordinate the progress since many operators are using MKLDNN
implementation in CPU now.

Many Thanks,

Lin

On Sun, Apr 28, 2019 at 7:52 PM Lv, Tao A  wrote:

> Thank you for bringing this topic to dev, Rohit.
>
> Regarding large tensor, can you articulate:
> - what's the max size of dimensionality? Which data type is used to define
> dimensionality (ndims)?
> - what's the max size of each dimension? Which data type is used to define
> dimension size (shape[x])?
> - what's the max size of total elements? Which data type is used to define
> element size (Prod(shape))?
>
> For me, any of these three can be *large*.
>
> -Original Message-
> From: Srivastava, Rohit Kumar [mailto:srivastava@buckeyemail.osu.edu]
> Sent: Saturday, April 27, 2019 7:33 AM
> To: dev@mxnet.incubator.apache.org
> Subject: [RFC] Support for creation of Large Tensors in MXNet
>
> Dear Community,
>
> Currently MXNet supports creation of Tensors containing up to 2^32
> elements. However, there are cases where tensors of over 5 billion
> elements are required.
>
> We plan to support creation of large tensors on MXNet. A design proposal
> is ready for review:
> https://cwiki.apache.org/confluence/display/MXNET/Large+Tensor+Support
>
> We will appreciate any help and feedbacks from the community.
>
> Thank you!
>
> Rohit
>
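The choice of `int64_t` for the total element count discussed above matters because a shape whose individual dimensions all fit comfortably in `int32_t` can still overflow it in the product. A small sketch, emulating C's 32-bit truncation in Python to show what a build using 32-bit indices would silently compute:

```python
# Why Prod(shape) needs int64_t: each dimension below fits easily in
# int32_t, but the product does not. We emulate int32_t wraparound to show
# what a build compiled with 32-bit indices would silently produce.

INT32_MAX = 2**31 - 1

def to_int32(x):
    """Emulate two's-complement int32_t truncation."""
    x &= 0xFFFFFFFF
    return x - 2**32 if x > INT32_MAX else x

shape = (65536, 65536)         # each dim well under 2^31
total = shape[0] * shape[1]    # 2^32 elements, e.g. a large embedding table

print(total)            # 4294967296  (correct with 64-bit indexing)
print(to_int32(total))  # 0           (silent wraparound with 32-bit)
```

This is the failure mode the compiler-flag default guards against: per-dimension sizes stay `int32_t`, while the element count must be `int64_t` once tensors beyond 2^31 elements are in play.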