Re: [Announcement] New Committer - Wang Jiajun

2019-04-16 Thread kellen sunderland
Welcome!  Very impressed with the work fixing memory leaks so far.

On Tue, Apr 16, 2019 at 9:14 AM Carin Meier  wrote:

> Congrats!
>
> On Tue, Apr 16, 2019 at 11:58 AM Anirudh Subramanian <
> anirudh2...@gmail.com>
> wrote:
>
> > Hi,
> >
> > Please join me in welcoming Wang Jiajun (https://github.com/arcadiaphy)
> > as a new committer of Apache (incubating) MXNet!
> >
> > Wang has been solving some tough bugs with respect to memory leaks,
> process
> > fork handling, dependency engine issues and custom op exception handling.
> >
> > Issue Involvement:
> >
> >
> https://github.com/apache/incubator-mxnet/issues?utf8=%E2%9C%93&q=is%3Aissue+involves%3Aarcadiaphy
> >
> > PRs authored:
> >
> >
> https://github.com/apache/incubator-mxnet/pulls?utf8=%E2%9C%93&q=is%3Apr+author%3Aarcadiaphy+
> >
> > Anirudh
> >
>


Re: [QUESTION] mxnet/Tuple vs nnvm/Tuple

2019-04-16 Thread Lin Yuan
Jun,

Thanks! I was also leaning towards your suggestion.
I have updated nnvm::Tuple to mxnet::Tuple for a few remaining places in
MXNet.
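
For anyone doing the same cleanup: the change is a mechanical
header/namespace swap. A minimal sketch, assuming the forked class keeps
the nnvm-style constructors (it did at the time of the PR):

// Before: shape types came from the nnvm submodule.
// #include <nnvm/tuple.h>
// nnvm::TShape shape = {2, 3};

// After: MXNet's forked copy, which also understands zero-dim/size tensors.
#include <mxnet/tuple.h>

mxnet::TShape shape = {2, 3};  // same interface, mxnet namespace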

Best,

Lin

On Tue, Apr 16, 2019 at 11:35 AM Jun Wu  wrote:

> include/mxnet/tuple.h was first copied from nnvm in this PR
>  so that we could make
> changes to it to support zero-dim and zero-size tensors without affecting
> the TVM project. That PR changed most of the places where nnvm::Tuple and
> nnvm::TShape were used to mxnet::Tuple and mxnet::TShape. If we still see a
> few locations not yet changed in the current codebase, we should change
> them to use the mxnet Tuple as well, for consistency. nnvm/tuple.h can then
> be deprecated in MXNet.
>
> On Mon, Apr 15, 2019 at 10:44 PM Lin Yuan  wrote:
>
> > Dear Community,
> >
> > Currently in MXNet there are two Tuple template classes defined in
> > mxnet/tuple.h and nnvm/tuple.h respectively. These two templates are
> > highly similar and mostly duplicated, except for a couple of functions.
> > However, they are used interchangeably in the current codebase, which
> > sometimes causes conflicts.
> >
> > Is there any historical reason that we keep two copies of the same
> > template class? If not, can we refactor the code to consolidate them
> > into one?
> >
> > Thanks!
> >
> > Lin
> >
>


Re: [QUESTION] mxnet/Tuple vs nnvm/Tuple

2019-04-16 Thread Jun Wu
include/mxnet/tuple.h was first copied from nnvm in this PR
 so that we could make
changes to it to support zero-dim and zero-size tensors without affecting
the TVM project. That PR changed most of the places where nnvm::Tuple and
nnvm::TShape were used to mxnet::Tuple and mxnet::TShape. If we still see a
few locations not yet changed in the current codebase, we should change them
to use the mxnet Tuple as well, for consistency. nnvm/tuple.h can then be
deprecated in MXNet.
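
To illustrate the motivation, here is a minimal self-contained sketch (not
MXNet's actual class) of the distinction the fork had to add: an unknown
shape, a zero-dim (scalar) tensor, and a zero-size tensor are three
different things, and the legacy representation conflated the first two:

#include <cassert>
#include <vector>

struct Shape {
  int ndim;                // -1 means "shape not known yet"
  std::vector<long> dims;  // meaningful only when ndim >= 0

  bool known() const { return ndim >= 0; }
  long size() const {      // number of elements; 1 for a scalar
    long s = 1;
    for (long d : dims) s *= d;
    return s;
  }
};

int main() {
  Shape unknown{-1, {}};  // legacy code used ndim == 0 for this case...
  Shape scalar{0, {}};    // ...making a true zero-dim tensor inexpressible
  Shape empty{1, {0}};    // zero-size tensor, shape (0,)
  assert(!unknown.known());
  assert(scalar.known() && scalar.size() == 1);
  assert(empty.known() && empty.size() == 0);
  return 0;
}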

On Mon, Apr 15, 2019 at 10:44 PM Lin Yuan  wrote:

> Dear Community,
>
> Currently in MXNet there are two Tuple template classes defined in
> mxnet/tuple.h and nnvm/tuple.h respectively. These two templates are
> highly similar and mostly duplicated, except for a couple of functions.
> However, they are used interchangeably in the current codebase, which
> sometimes causes conflicts.
>
> Is there any historical reason that we keep two copies of the same
> template class? If not, can we refactor the code to consolidate them into
> one?
>
> Thanks!
>
> Lin
>


Re: Changes to MPI-operator

2019-04-16 Thread Roshani Nagmote
Sounds good. We (Pinar, Vandana, and I) are currently prototyping, and we
are planning to start a discussion on the dev list once we reach a logical
conclusion. We will share more details soon and seek feedback from the
community.

Thanks,
Roshani

On Mon, Apr 15, 2019 at 5:30 PM Yuan Tang  wrote:

> I am cc’ing the MXNet dev mailing list here.
>
> Thanks for the note, Roshani. Looking forward to seeing your contribution!
> Let’s also discuss this on the MXNet dev mailing list, since other people
> (e.g., Carl and Lin) might be working on this as well, to avoid duplicate
> work.
>
> Best,
> Yuan
>
> On Mon, Apr 15, 2019 at 5:51 PM Rong Ou  wrote:
>
>> Sounds great! Yes it would be nice to have some examples for MXNet.
>>
>> On Mon, Apr 15, 2019 at 3:36 PM Roshani Nagmote <
>> roshaninagmo...@gmail.com> wrote:
>>
>>> Hi,
>>>
>>> I work on Apache MXNet, and recently I used MPI-Operator to run
>>> distributed training with MXNet and Horovod on Kubernetes.
>>> A few other folks and I tried to adjust the capacity of a training job
>>> based on the available workers and restart the training job from where it
>>> left off if any worker goes away in between.
>>>
>>> To do this, we had to make a few modifications to MPI-Operator, for
>>> example, updating workerReplicas and launcherRole. Currently, the changes
>>> are in my repo, and I will be making a PR on MPI-Operator with them.
>>> I am also planning to contribute a few examples. I wanted to reach out to
>>> you first before creating a PR.
>>>
>>> Please let me know what your thoughts are on this.
>>>
>>> Thanks,
>>> Roshani
>>>
>>


Re: [Announcement] New Committer - Wang Jiajun

2019-04-16 Thread Carin Meier
Congrats!

On Tue, Apr 16, 2019 at 11:58 AM Anirudh Subramanian 
wrote:

> Hi,
>
> Please join me in welcoming Wang Jiajun (https://github.com/arcadiaphy) as
> a new committer of Apache (incubating) MXNet!
>
> Wang has been solving some tough bugs with respect to memory leaks, process
> fork handling, dependency engine issues and custom op exception handling.
>
> Issue Involvement:
>
> https://github.com/apache/incubator-mxnet/issues?utf8=%E2%9C%93&q=is%3Aissue+involves%3Aarcadiaphy
>
> PRs authored:
>
> https://github.com/apache/incubator-mxnet/pulls?utf8=%E2%9C%93&q=is%3Apr+author%3Aarcadiaphy+
>
> Anirudh
>


Re: Changes to MPI-operator

2019-04-15 Thread Yuan Tang
I am cc’ing the MXNet dev mailing list here.

Thanks for the note, Roshani. Looking forward to seeing your contribution!
Let’s also discuss this on the MXNet dev mailing list, since other people
(e.g., Carl and Lin) might be working on this as well, to avoid duplicate
work.

Best,
Yuan

On Mon, Apr 15, 2019 at 5:51 PM Rong Ou  wrote:

> Sounds great! Yes it would be nice to have some examples for MXNet.
>
> On Mon, Apr 15, 2019 at 3:36 PM Roshani Nagmote 
> wrote:
>
>> Hi,
>>
>> I work on Apache MXNet, and recently I used MPI-Operator to run
>> distributed training with MXNet and Horovod on Kubernetes.
>> A few other folks and I tried to adjust the capacity of a training job
>> based on the available workers and restart the training job from where it
>> left off if any worker goes away in between.
>>
>> To do this, we had to make a few modifications to MPI-Operator, for
>> example, updating workerReplicas and launcherRole. Currently, the changes
>> are in my repo, and I will be making a PR on MPI-Operator with them.
>> I am also planning to contribute a few examples. I wanted to reach out to
>> you first before creating a PR.
>>
>> Please let me know what your thoughts are on this.
>>
>> Thanks,
>> Roshani
>>
>


Re: [MXNET 2.0 Wishlist] [DISCUSS] Backend choices during runtime

2019-04-12 Thread Tianqi Chen
+1.

While I personally like Slack, I don't think we should treat it as a
public archive: "everything that happens (also) happens in dev@".

Tianqi



On Fri, Apr 12, 2019 at 1:19 AM Marco de Abreu 
wrote:

> I'd prefer if we keep discussions on the dev-list instead of slack - feel
> free to open another thread.
>
> -Marco
>
> Pedro Larroy  wrote on Fri., 12 Apr. 2019,
> 02:24:
>
> > I will respond in slack, so we don't derail the original thread's
> > topic with my points.
> >
> > Looking forward to your proposal.
> >
> > On Thu, Apr 11, 2019 at 1:00 PM Junru Shao 
> > wrote:
> > >
> > > I don't have ideas about the following issues:
> > >
> > > 1) Reducing the abuse of inlined code by moving more logic to
> > > implementation files and improving scoping, which will also speed up
> > > compilation
> > > 2) Reducing the runtime of some unit tests
> > > 3) Improving MXNet startup time
> > >
> > > I will be super interested to hear about your ideas :-)
> > >
> > >
> > > On Thu, Apr 11, 2019 at 12:52 PM Junru Shao 
> > wrote:
> > >
> > > > We have a systematic solution that avoids the ABI headache. I am
> > > > struggling with some errands, and will share our proposal here as
> > > > soon as I can. This will be a very interesting topic to discuss.
> > > > Let's work hard together and make it perfect :-)
> > > >
> > > > On Thu, Apr 11, 2019 at 12:43 PM Pedro Larroy <
> > > > pedro.larroy.li...@gmail.com> wrote:
> > > >
> > > >> Thanks Marco for raising this issue. I think we can certainly make
> > > >> some improvements in modularization and the build. At the same time,
> > > >> Tianqi's point of view is important to consider and on point. I see
> > > >> a high risk of overengineering in such an endeavor.
> > > >>
> > > >> I also see increased complexity, difficulty debugging, C++ ABI
> > > >> headaches, API compatibility, crashes inside a binary module, etc.,
> > > >> which I don't want to deal with as a developer or even as an MXNet
> > > >> user. Does somebody have answers to these problems?
> > > >>
> > > >> If somebody thinks they have a good solution, by all means propose a
> > > >> design in the wiki; I think we are all open. Personally, I see
> > > >> several other lower-hanging fruits that need our attention:
> > > >>  * Simplifying our build logic
> > > >>  * CUDA selection in CMake
> > > >>  * Reducing the abuse of inlined code by moving more logic to
> > > >> implementation files and improving scoping, which will also speed up
> > > >> compilation (some units take more than 5 minutes to build and lots
> > > >> of RAM on a top-of-the-line CPU core)
> > > >>  * Reducing the runtime of some unit tests
> > > >>  * Improving MXNet startup time
> > > >>  * Thread safety
> > > >> These and other improvements in our codebase would bring immediate
> > > >> benefits without the risks of overengineering a plugin system. I
> > > >> also question our bandwidth for such an endeavor.
> > > >>
> > > >> I would say, let's apply the KISS principle: let's make the project
> > > >> fast to build, easy to work on, well documented, and easy to
> > > >> contribute to before building the next Netscape browser. Otherwise
> > > >> we could save ourselves this exercise and switch to Rust directly.
> > > >>
> > > >> Pedro.
> > > >>
> > > >>
> > > >>
> > > >> On Mon, Apr 8, 2019 at 9:42 AM Tianqi Chen <
> tqc...@cs.washington.edu>
> > > >> wrote:
> > > >> >
> > > >> > Just to clarify: I am not questioning the usefulness of the
> > > >> > separation, just want to highlight the technical challenges here
> > > >> > based on our past experiences.
> > > >> >
> > > >> > Crossing DLL boundaries in C++ can create quite a lot of problems,
> > > >> > especially when some of the dependencies use a different version
> > > >> > of the compiler or static packaging, or simply because of dynamic
> > > >> > linking differences on Windows. These problems could make this
> > > >> > direction less appealing compared to focusing effort on other
> > > >> > things.
> > > >> >
> > > >> > Technically, as a first step, it is possible to make dependency
> > > >> > changes not touch the global header files and instead go through
> > > >> > registration, so that changing a certain component won't trigger a
> > > >> > global recompile in CMake. This is also a required step toward
> > > >> > some modularity.
> > > >> >
> > > >> > For plugins, solutions that use a C ABI can be used for certain
> > > >> > plugin modules.
> > > >> >
> > > >> > Some of the discussion has been tied to what the interface should
> > > >> > look like. I think we should use different threads for those and
> > > >> > put in more thought.
> > > >> >
> > > >> > Tianqi
> > > >> >
> > > >> >
> > > >> >
> > > >> > On Sun, Apr 7, 2019 at 4:39 PM kellen sunderland <
> > > >> > kellen.sunderl...@gmail.com> wrote:
> > > >> >
> > > >> > > I think we can make some incremental progress.  My thoughts were
> > > >> > > along the lines of plugins (thinking about what 

Re: Benchmarking MXNet with different compilers and different OpenMP implementations (results)

2019-04-12 Thread Pedro Larroy
Are there any updates on this?

This is still affecting multiprocessing; some tests hang:

rces. For information on submitting this issue, please see
https://bugs.llvm.org/.
[INFO] Setting test np/mx/python random seeds, use
MXNET_TEST_SEED=2124604270 to reproduce.
Assertion failure at kmp_runtime.cpp(6479): __kmp_thread_pool == __null.
OMP: Error #13: Assertion failure at kmp_runtime.cpp(6479).
OMP: Hint: Please submit a bug report with this message, compile and
run commands used, and machine configuration info including native
compiler and operating system versions. Faster response will be
obtained by including all program sources. For information on
submitting this issue, please see https://bugs.llvm.org/.
Assertion failure at kmp_runtime.cpp(6479): __kmp_thread_pool == __null.
OMP: Error #13: Assertion failure at kmp_runtime.cpp(6479).
OMP: Hint: Please submit a bug report with this message, compile and
run commands used, and machine configuration info including native
compiler and operating system versions. Faster response will be
obtained by including all program sources. For information on
submitting this issue, please see https://bugs.llvm.org/.
^CException ignored in: >
Traceback (most recent call last):
  File "/home/piotr/mxnet_other/python/mxnet/gluon/data/dataloader.py",
line 595, in __del__
self._worker_pool.terminate()
  File "/usr/lib/python3.6/multiprocessing/pool.py", line 567, in terminate
self._terminate()
  File "/usr/lib/python3.6/multiprocessing/util.py", line 186, in __call__
res = self._callback(*self._args, **self._kwargs)
  File "/usr/lib/python3.6/multiprocessing/pool.py", line 597, in
_terminate_pool
cls._help_stuff_finish(inqueue, task_handler, len(pool))
  File "/usr/lib/python3.6/multiprocessing/pool.py", line 582, in
_help_stuff_finish
inqueue._rlock.acquire()
KeyboardInterrupt
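
For reference, a minimal sketch of the pattern behind this class of hang,
assuming an LLVM-OpenMP-style runtime (compile with -fopenmp): the parent
warms up the OpenMP thread pool and then forks, and the child inherits the
pool's bookkeeping but not its threads, so its first parallel region can
assert or deadlock:

#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>
#include <cstdio>

int main() {
  float sum = 0.f;
  // Parent: this region spins up the OpenMP worker thread pool.
#pragma omp parallel for reduction(+ : sum)
  for (int i = 0; i < 1000; ++i) sum += i;

  pid_t pid = fork();
  if (pid == 0) {
    // Child: only the forking thread survives, but the runtime still
    // believes the pool exists; this region can trip assertions like the
    // kmp_runtime.cpp "__kmp_thread_pool == __null" one quoted above.
#pragma omp parallel for reduction(+ : sum)
    for (int i = 0; i < 1000; ++i) sum += i;
    std::printf("child finished: %f\n", sum);
    _exit(0);
  }
  waitpid(pid, nullptr, 0);
  return 0;
}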

Pedro.

On Thu, Feb 14, 2019 at 6:30 AM Tsukrov, Stanislav
 wrote:
>
> Thanks Aaron for the feedback.
>
> > As for your next steps, would you propose that cmake be brought up to 
> > parity?
> Yes. sse2 in cmake vs sse3 in make is a minor example without high impact. 
> There are others.
>
> > It seems strange that it causes slowness and if so, it shouldn't be 
> > recommended for now.
> There are some issues in the CMake files that should be fixed. Some of
> them are worked around for the benchmark.
>
> Best Regards
>
> Stas
>
> On 14.02.19, 14:09, "Anton Chernov"  wrote:
>
> Thank you, Aaron, for your interest in the topic.
>
> My main previous proposal still stands: remove the bundled OpenMP
> submodule and use the OpenMP provided by the environment [1]. This might
> lead to performance degradation in some cases where an old OpenMP library
> is used or thread affinity wasn't set properly. But that would be a
> problem of the environment, not MXNet.
>
> I described some alternative solutions in [1] as part of this [2] thread.
> Tricking the linker with symlinks in both cases should make it possible to
> avoid multiple OpenMP implementations being linked into MXNet
> simultaneously. Windows questions would still be open.
>
> Best
> Anton
>
> [1] https://github.com/apache/incubator-mxnet/pull/12160
> [2]
> 
> https://lists.apache.org/thread.html/007d8db15a1782e1b20896a4050b62710d4ff0908c67b94af7cb0f8b@%3Cdev.mxnet.apache.org%3E
> [3]
> 
> https://lists.apache.org/thread.html/4827f0f742b6e7e070da350ea81226d059401527f3072ce8b33c1fdf@%3Cdev.mxnet.apache.org%3E
>
>
> Tue, 12 Feb. 2019 at 16:39, Aaron Markham :
>
> > This is really great research. I've often wondered what the difference
> > really is, and why it has to be so complicated. It seems the answer is
> > there isn't much difference and it shouldn't be as complex.
> > As for your next steps, would you propose that CMake be brought up to
> > parity? It seems strange that it causes slowness, and if so, it
> > shouldn't be recommended for now.
> > Also, testing for Windows compilers might be quite important, as install
> > stats suggest a significant portion of users are on Windows. Wouldn't
> > this nudge the decision of what to use as a rule going forward?
> > I ran into this submodule OpenMP issue on Windows myself. How does that
> > get fixed? Do we have to repackage all of the submodules to make sure
> > they use the recommended implementation, or do they use what the system
> > expects?
> >
> > Cheers,
> > Aaron
> >
> > On Tue, Feb 12, 2019, 04:37 Anton Chernov  wrote:
> >
> > > Dear MXNet community,
> > >
> > > Due to multiple problems related to OpenMP and a stale proposed change
> > > [1], we have been working on gathering performance data on the impact
> > > of using different OpenMP implementations with MXNet (great thanks to
> > > Stanislav Tsukrov for the hard work). The results can be found here
> > > [2].
> > >
> > > As a short summary of the investigation: 

Re: duplicated nnvm code

2019-04-12 Thread Pedro Larroy
I would think that if we are using nnvm from TVM, we should not have
duplicated code in our repository. I think we should either use the
subrepository as a 3rdparty dependency or assimilate the code into our
codebase, as is planned with mshadow. But I guess TVM is making heavy use
of nnvm, so in this case it might make sense to reuse it across projects.
@Tianqi?
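
To make the hazard concrete, here is a small self-contained sketch of the
static-registration pattern nnvm relies on. In this single file the second
registration deterministically wins; in the real build the two copies of
gradient.cc live in different object files, so which one wins is decided by
link and initialization order rather than by the programmer:

#include <iostream>
#include <map>
#include <string>

// Meyers singleton holding name -> pass function, as a stand-in for
// nnvm's pass registry.
std::map<std::string, void (*)()>& registry() {
  static std::map<std::string, void (*)()> r;
  return r;
}

struct Register {
  Register(const std::string& name, void (*fn)()) { registry()[name] = fn; }
};

void gradient_v1() { std::cout << "src/nnvm/gradient.cc copy\n"; }
void gradient_v2() { std::cout << "3rdparty/tvm/nnvm copy\n"; }

// In MXNet, these two registrations come from the two duplicated files.
static Register r1("Gradient", gradient_v1);
static Register r2("Gradient", gradient_v2);  // silently overwrites r1

int main() {
  registry()["Gradient"]();  // prints the v2 copy here; unspecified in general
  return 0;
}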

On Thu, Apr 11, 2019 at 10:16 PM Junru Shao  wrote:
>
> We should remove 3rdparty/tvm/nnvm/gradient.cc.o imo
>
> On Thu, Apr 11, 2019 at 6:44 PM Pedro Larroy 
> wrote:
>
> > Hi
> >
> > I found that src/nnvm and 3rdparty/tvm/nnvm/src/pass/ have duplicated
> > code that we are linking in:
> >
> > ./CMakeFiles/mxnet_static.dir/3rdparty/tvm/nnvm/src/pass/gradient.cc.o
> > ./CMakeFiles/mxnet_static.dir/src/nnvm/gradient.cc.o
> >
> > This can potentially cause problems when linking. Which symbol ends up
> > being used is left as an exercise for the reader.
> >
> > Is this intentional?  Should we address this?
> >
> > Pedro.
> >


Re: [MXNET 2.0 Wishlist] [DISCUSS] Backend choices during runtime

2019-04-11 Thread Marco de Abreu
I'd prefer if we keep discussions on the dev-list instead of slack - feel
free to open another thread.

-Marco

Pedro Larroy  wrote on Fri., 12 Apr. 2019,
02:24:

> I will respond in slack, so we don't derail the original thread's
> topic with my points.
>
> Looking forward to your proposal.
>
> On Thu, Apr 11, 2019 at 1:00 PM Junru Shao 
> wrote:
> >
> > I don't have ideas about the following issues:
> >
> > 1) Reducing the abuse of inlined code by moving more logic to
> > implementation files and improving scoping, which will also speed up
> > compilation
> > 2) Reducing the runtime of some unit tests
> > 3) Improving MXNet startup time
> >
> > I will be super interested to hear about your ideas :-)
> >
> >
> > On Thu, Apr 11, 2019 at 12:52 PM Junru Shao 
> wrote:
> >
> > > We have a systematic solution that avoids the ABI headache. I am
> > > struggling with some errands, and will share our proposal here as soon
> > > as I can. This will be a very interesting topic to discuss. Let's work
> > > hard together and make it perfect :-)
> > >
> > > On Thu, Apr 11, 2019 at 12:43 PM Pedro Larroy <
> > > pedro.larroy.li...@gmail.com> wrote:
> > >
> > >> Thanks Marco for raising this issue. I think we can certainly make
> > >> some improvements in modularization and the build. At the same time,
> > >> Tianqi's point of view is important to consider and on point. I see a
> > >> high risk of overengineering in such an endeavor.
> > >>
> > >> I also see increased complexity, difficulty debugging, C++ ABI
> > >> headaches, API compatibility, crashes inside a binary module, etc.,
> > >> which I don't want to deal with as a developer or even as an MXNet
> > >> user. Does somebody have answers to these problems?
> > >>
> > >> If somebody thinks they have a good solution, by all means propose a
> > >> design in the wiki; I think we are all open. Personally, I see several
> > >> other lower-hanging fruits that need our attention:
> > >>  * Simplifying our build logic
> > >>  * CUDA selection in CMake
> > >>  * Reducing the abuse of inlined code by moving more logic to
> > >> implementation files and improving scoping, which will also speed up
> > >> compilation (some units take more than 5 minutes to build and lots of
> > >> RAM on a top-of-the-line CPU core)
> > >>  * Reducing the runtime of some unit tests
> > >>  * Improving MXNet startup time
> > >>  * Thread safety
> > >> These and other improvements in our codebase would bring immediate
> > >> benefits without the risks of overengineering a plugin system. I also
> > >> question our bandwidth for such an endeavor.
> > >>
> > >> I would say, let's apply the KISS principle: let's make the project
> > >> fast to build, easy to work on, well documented, and easy to
> > >> contribute to before building the next Netscape browser. Otherwise we
> > >> could save ourselves this exercise and switch to Rust directly.
> > >>
> > >> Pedro.
> > >>
> > >>
> > >>
> > >> On Mon, Apr 8, 2019 at 9:42 AM Tianqi Chen 
> > >> wrote:
> > >> >
> > >> > Just to clarify: I am not questioning the usefulness of the
> > >> > separation, just want to highlight the technical challenges here
> > >> > based on our past experiences.
> > >> >
> > >> > Crossing DLL boundaries in C++ can create quite a lot of problems,
> > >> > especially when some of the dependencies use a different version of
> > >> > the compiler or static packaging, or simply because of dynamic
> > >> > linking differences on Windows. These problems could make this
> > >> > direction less appealing compared to focusing effort on other
> > >> > things.
> > >> >
> > >> > Technically, as a first step, it is possible to make dependency
> > >> > changes not touch the global header files and instead go through
> > >> > registration, so that changing a certain component won't trigger a
> > >> > global recompile in CMake. This is also a required step toward some
> > >> > modularity.
> > >> >
> > >> > For plugins, solutions that use a C ABI can be used for certain
> > >> > plugin modules.
> > >> >
> > >> > Some of the discussion has been tied to what the interface should
> > >> > look like. I think we should use different threads for those and
> > >> > put in more thought.
> > >> >
> > >> > Tianqi
> > >> >
> > >> >
> > >> >
> > >> > On Sun, Apr 7, 2019 at 4:39 PM kellen sunderland <
> > >> > kellen.sunderl...@gmail.com> wrote:
> > >> >
> > >> > > I think we can make some incremental progress.  My thoughts were
> > >> > > along the lines of plugins (thinking about what happens with the
> > >> > > VLC project).  At process launch time we could gather some
> > >> > > information about our execution environment (either through
> > >> > > configuration, or by convention looking at our folder structure
> > >> > > and libraries available).  We could then later load the components
> > >> > > we need after understanding if we're using a CUDA backend and what
> > >> > > operators or subgraph components we would need.  Advantages
> > >> 

Re: duplicated nnvm code

2019-04-11 Thread Junru Shao
We should remove 3rdparty/tvm/nnvm/gradient.cc.o imo

On Thu, Apr 11, 2019 at 6:44 PM Pedro Larroy 
wrote:

> Hi
>
> I found that src/nnvm and 3rdparty/tvm/nnvm/src/pass/ have duplicated
> code that we are linking in:
>
> ./CMakeFiles/mxnet_static.dir/3rdparty/tvm/nnvm/src/pass/gradient.cc.o
> ./CMakeFiles/mxnet_static.dir/src/nnvm/gradient.cc.o
>
> This can potentially cause problems when linking. Which symbol ends up
> being used is left as an exercise for the reader.
>
> Is this intentional?  Should we address this?
>
> Pedro.
>


Re: [MXNET 2.0 Wishlist] [DISCUSS] Backend choices during runtime

2019-04-11 Thread Pedro Larroy
I will respond in slack, so we don't derail the original thread's
topic with my points.

Looking forward to your proposal.

On Thu, Apr 11, 2019 at 1:00 PM Junru Shao  wrote:
>
> I don't have ideas about the following issues:
>
> 1) Reducing the abuse of inlined code by moving more logic to
> implementation files and improving scoping, which will also speed up
> compilation
> 2) Reducing the runtime of some unit tests
> 3) Improving MXNet startup time
>
> I will be super interested to hear about your ideas :-)
>
>
> On Thu, Apr 11, 2019 at 12:52 PM Junru Shao  wrote:
>
> > We have a systematic solution that avoids the ABI headache. I am
> > struggling with some errands, and will share our proposal here as soon as
> > I can. This will be a very interesting topic to discuss. Let's work hard
> > together and make it perfect :-)
> >
> > On Thu, Apr 11, 2019 at 12:43 PM Pedro Larroy <
> > pedro.larroy.li...@gmail.com> wrote:
> >
> >> Thanks Marco for raising this issue. I think we can certainly make some
> >> improvements in modularization and the build. At the same time, Tianqi's
> >> point of view is important to consider and on point. I see a high risk
> >> of overengineering in such an endeavor.
> >>
> >> I also see increased complexity, difficulty debugging, C++ ABI
> >> headaches, API compatibility, crashes inside a binary module, etc.,
> >> which I don't want to deal with as a developer or even as an MXNet
> >> user. Does somebody have answers to these problems?
> >>
> >> If somebody thinks they have a good solution, by all means propose a
> >> design in the wiki; I think we are all open. Personally, I see several
> >> other lower-hanging fruits that need our attention:
> >>  * Simplifying our build logic
> >>  * CUDA selection in CMake
> >>  * Reducing the abuse of inlined code by moving more logic to
> >> implementation files and improving scoping, which will also speed up
> >> compilation (some units take more than 5 minutes to build and lots of
> >> RAM on a top-of-the-line CPU core)
> >>  * Reducing the runtime of some unit tests
> >>  * Improving MXNet startup time
> >>  * Thread safety
> >> These and other improvements in our codebase would bring immediate
> >> benefits without the risks of overengineering a plugin system. I also
> >> question our bandwidth for such an endeavor.
> >>
> >> I would say, let's apply the KISS principle: let's make the project
> >> fast to build, easy to work on, well documented, and easy to contribute
> >> to before building the next Netscape browser. Otherwise we could save
> >> ourselves this exercise and switch to Rust directly.
> >>
> >> Pedro.
> >>
> >>
> >>
> >> On Mon, Apr 8, 2019 at 9:42 AM Tianqi Chen 
> >> wrote:
> >> >
> >> > Just to clarify: I am not questioning the usefulness of the
> >> > separation, just want to highlight the technical challenges here based
> >> > on our past experiences.
> >> >
> >> > Crossing DLL boundaries in C++ can create quite a lot of problems,
> >> > especially when some of the dependencies use a different version of
> >> > the compiler or static packaging, or simply because of dynamic linking
> >> > differences on Windows. These problems could make this direction less
> >> > appealing compared to focusing effort on other things.
> >> >
> >> > Technically, as a first step, it is possible to make dependency
> >> > changes not touch the global header files and instead go through
> >> > registration, so that changing a certain component won't trigger a
> >> > global recompile in CMake. This is also a required step toward some
> >> > modularity.
> >> >
> >> > For plugins, solutions that use a C ABI can be used for certain plugin
> >> > modules.
> >> >
> >> > Some of the discussion has been tied to what the interface should look
> >> > like. I think we should use different threads for those and put in
> >> > more thought.
> >> >
> >> > Tianqi
> >> >
> >> >
> >> >
> >> > On Sun, Apr 7, 2019 at 4:39 PM kellen sunderland <
> >> > kellen.sunderl...@gmail.com> wrote:
> >> >
> >> > > I think we can make some incremental progress.  My thoughts were
> >> > > along the lines of plugins (thinking about what happens with the VLC
> >> > > project).  At process launch time we could gather some information
> >> > > about our execution environment (either through configuration, or by
> >> > > convention looking at our folder structure and libraries available).
> >> > > We could then later load the components we need after understanding
> >> > > if we're using a CUDA backend and what operators or subgraph
> >> > > components we would need.  Advantages would be that we would move a
> >> > > lot of the current conditional compile logic to runtime, and
> >> > > automate a lot of it.  It would also make packaging binaries for
> >> > > targeted environments a little easier.  As an example we could
> >> > > compile once, then remove CUDA focused libraries for systems that
> >> > > are going to run on CPUs.
> >> > >
> >> > > On Sun, Apr 7, 2019 at 2:45 

Re: Implementing zero-dim and zero-size tensors in MXNet and its impact on your codebases

2019-04-11 Thread Jun Wu
Thanks all for the discussion. To unblock this PR from the discussion of
SemVer concerns, I made copies of the APIs I need to change and worked on
those copies. Thanks to Tong, Yizhi, and Sergey, who updated all the
language bindings to use the new C APIs supporting zero-dim/size tensors,
this PR is currently free of SemVer concerns for both the C APIs and the
language bindings.

Nevertheless, I think it's time for us to ponder the necessity of keeping
all C APIs under the umbrella of SemVer. As Marco pointed out, treating C
APIs as user-facing APIs brings in the risk of diverging frontend
functionality, and it considerably increases the complexity of development.
We could conduct a survey among MXNet users, collecting the C APIs they
directly interact with, and keep those under SemVer.

Thanks all again for sharing your thoughts. We will keep rolling out
NumPy-compatible features after this PR, and we look forward to helping
MXNet users migrate to the NumPy features.
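
For readers curious what "making copies of the APIs" looks like, here is a
hedged sketch with hypothetical names (the real entry points live in
include/mxnet/c_api.h): the legacy function keeps its exact signature, and a
new *Ex variant carries the numpy-compatible shape semantics:

#include <cstdint>

extern "C" {
// Legacy signature (hypothetical): unsigned ndim, where 0 historically
// doubled as "shape unknown", so a true scalar could not be expressed.
int MXDoInferShape(uint32_t ndim, const uint32_t* dims);

// New copy (hypothetical): signed ndim with -1 = unknown, 0 = zero-dim
// scalar, and 64-bit dims that may legitimately be 0 for zero-size tensors.
int MXDoInferShapeEx(int ndim, const int64_t* dims);
}

int MXDoInferShape(uint32_t ndim, const uint32_t* dims) {
  (void)ndim; (void)dims;  // legacy behavior preserved for old frontends
  return 0;
}

int MXDoInferShapeEx(int ndim, const int64_t* dims) {
  (void)ndim; (void)dims;  // numpy-compatible behavior for new bindings
  return 0;
}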

On Thu, Apr 11, 2019 at 12:32 PM Anirudh Subramanian 
wrote:

> Hi Marco,
>
> The backend private APIs in engine, executor, storage, ndarray, etc. can
> still be changed.
> I understand that it may introduce code duplication, but introducing
> duplicate C APIs can still be better than the backend developer having to
> worry about different frontends. Not to mention a frontend that is not yet
> merged into the repo but lives in its own repo; such repos should also be
> considered consumers of the MXNet API.
>
> Anirudh
>
> On Thu, Apr 11, 2019 at 12:12 PM Marco de Abreu 
> wrote:
>
> > Good point about the adoption speed for the different frontends, Anirudh.
> > While this is quite a valid argument, I'm afraid of the complexity it
> > might introduce, as well as the risk of further diverging frontend
> > functionality.
> >
> > I'd rather propose that we introduce a guideline to follow when changes
> to
> > C-APIs are being made. Part of that could be starting a thread like this
> > one that lays down the changes that are being made to the C-API. We could
> > then coordinate the changes to the different frontends and gather people
> > from the community who feel comfortable to do the changes in the
> respective
> > frontends. If nobody speaks up, the original proposer of that change
> could
> > be responsible to do the necessary changes.
> >
> > An adjacent topic for this discussion could be test coverage: We
> currently
> > have no tools to determine which frontend hits which C-API and where
> > changes have to be made. This might be a topic we should spark up again
> > separately.
> >
> > -Marco
> >
> > On Thu, Apr 11, 2019 at 8:55 PM Marco de Abreu 
> > wrote:
> >
> > > My personal opinion towards that discussion is that we should keep the
> > > C-API free from semantic versioning because otherwise we're introducing
> > two
> > > "fronts" that we have to maintain backwards compatibility for. By the
> > way,
> > > currently, we have no way to verify and guarantee the compatibility of
> > the
> > > C-API. The major issue I'd see with adding SemVer for the C-API is that
> > > this would increase the complexity of changes that are (in my opinion)
> > > entirely internal to MXNet by introducing another thing that developers
> > > would have to look out for - possibly introducing code duplication as
> > > described by Jun while not providing any clear benefits to me.
> > >
> > > If there is a use case where people cannot even use our C++ package,
> > > then we could have discussions about introducing a user-facing C-API,
> > > but right now this approach of interfacing with our C-API (although I
> > > know that people use it) seems a bit like using undocumented Windows
> > > APIs: they work, but it's at your own risk, they might break at any
> > > time, and there's no guarantee.
> > >
> > > -Marco
> > >
> > > On Thu, Apr 11, 2019 at 8:52 PM Anirudh Subramanian <
> > anirudh2...@gmail.com>
> > > wrote:
> > >
> > >> Hi Jun,
> > >>
> > >> So far, from what I have observed, there has been an undocumented
> > >> guideline not to break C APIs (example:
> > >>
> >
> https://github.com/apache/incubator-mxnet/pull/11429#discussion_r199564999
> > >> ).
> > >> Although the C APIs are supposed to serve only as bridges for frontend
> > >> language bindings (exception being C Predict API), I think there are
> 3rd
> > >> party libraries like Horovod which are starting to
> > >> depend on it (https://github.com/apache/incubator-mxnet/pull/14615) .
> > >>
> > >> Also, since MXNet has a lot of frontend bindings, ensuring backward
> > >> compatibility with semver can help frontend bindings adopt the new
> > >> APIs at their own pace.
> > >>
> > >> Anirudh
> > >>
> > >>
> > >> On Thu, Apr 11, 2019 at 10:58 AM Jun Wu  wrote:
> > >>
> > >> > I'm not sure whether C APIs should fall under semver. This is the
> > >> > discussion we would like to have with the community.
> > >> >
> > >> > My thinking on this:
> > >> > 1. In most cases, C APIs only serve 

Re: [MXNET 2.0 Wishlist] [DISCUSS] Backend choices during runtime

2019-04-11 Thread Junru Shao
I don't have ideas about the following issues:

1) Reducing the abuse of inlined code by moving more logic to implementation
files and improving scoping, which will also speed up compilation
2) Reducing the runtime of some unit tests
3) Improving MXNet startup time

I will be super interested to hear about your ideas :-)


On Thu, Apr 11, 2019 at 12:52 PM Junru Shao  wrote:

> We have a systematic solution that avoids the ABI headache. I am struggling
> with some errands, and will share our proposal here as soon as I can.
> This will be a very interesting topic to discuss. Let's work hard together
> and make it perfect :-)
>
> On Thu, Apr 11, 2019 at 12:43 PM Pedro Larroy <
> pedro.larroy.li...@gmail.com> wrote:
>
>> Thanks Marco for raising this issue. I think we can certainly make some
>> improvements in modularization and the build. At the same time, Tianqi's
>> point of view is important to consider and on point. I see a high risk
>> of overengineering in such an endeavor.
>>
>> I also see increased complexity, difficulty debugging, C++ ABI
>> headaches, API compatibility, crashes inside a binary module, etc.,
>> which I don't want to deal with as a developer or even as an MXNet
>> user. Does somebody have answers to these problems?
>>
>> If somebody thinks they have a good solution, by all means propose a
>> design in the wiki; I think we are all open. Personally, I see several
>> other lower-hanging fruits that need our attention:
>>  * Simplifying our build logic
>>  * CUDA selection in CMake
>>  * Reducing the abuse of inlined code by moving more logic to
>> implementation files and improving scoping, which will also speed up
>> compilation (some units take more than 5 minutes to build and lots of
>> RAM on a top-of-the-line CPU core)
>>  * Reducing the runtime of some unit tests
>>  * Improving MXNet startup time
>>  * Thread safety
>> These and other improvements in our codebase would bring immediate
>> benefits without the risks of overengineering a plugin system. I also
>> question our bandwidth for such an endeavor.
>>
>> I would say, let's apply the KISS principle: let's make the project
>> fast to build, easy to work on, well documented, and easy to contribute
>> to before building the next Netscape browser. Otherwise we could save
>> ourselves this exercise and switch to Rust directly.
>>
>> Pedro.
>>
>>
>>
>> On Mon, Apr 8, 2019 at 9:42 AM Tianqi Chen 
>> wrote:
>> >
>> > Just to clarify: I am not questioning the usefulness of the separation,
>> > just want to highlight the technical challenges here based on our past
>> > experiences.
>> >
>> > Crossing DLL boundaries in C++ can create quite a lot of problems,
>> > especially when some of the dependencies use a different version of the
>> > compiler or static packaging, or simply because of dynamic linking
>> > differences on Windows. These problems could make this direction less
>> > appealing compared to focusing effort on other things.
>> >
>> > Technically, as a first step, it is possible to make dependency changes
>> > not touch the global header files and instead go through registration,
>> > so that changing a certain component won't trigger a global recompile
>> > in CMake. This is also a required step toward some modularity.
>> >
>> > For plugins, solutions that use a C ABI can be used for certain plugin
>> > modules.
>> >
>> > Some of the discussion has been tied to what the interface should look
>> > like. I think we should use different threads for those and put in more
>> > thought.
>> >
>> > Tianqi
>> >
>> >
>> >
>> > On Sun, Apr 7, 2019 at 4:39 PM kellen sunderland <
>> > kellen.sunderl...@gmail.com> wrote:
>> >
>> > > I think we can make some incremental progress.  My thoughts were along
>> > > the lines of plugins (thinking about what happens with the VLC
>> > > project).  At process launch time we could gather some information
>> > > about our execution environment (either through configuration, or by
>> > > convention looking at our folder structure and libraries available).
>> > > We could then later load the components we need after understanding if
>> > > we're using a CUDA backend and what operators or subgraph components
>> > > we would need.  Advantages would be that we would move a lot of the
>> > > current conditional compile logic to runtime, and automate a lot of
>> > > it.  It would also make packaging binaries for targeted environments a
>> > > little easier.  As an example we could compile once, then remove CUDA
>> > > focused libraries for systems that are going to run on CPUs.
>> > >
>> > > On Sun, Apr 7, 2019 at 2:45 PM Tianqi Chen 
>> > > wrote:
>> > >
>> > > > While I personally like the idea, this can be something that is
>> > > > fairly technically challenging, and I would caution against this
>> > > > idea vs. pushing for good features and just allowing runtime
>> > > > configuration.
>> > > >
>> > > > The main problem here is due to the C++ ABI. There is no standard
>> > > > C++ ABI across compilers, 

Re: [MXNET 2.0 Wishlist] [DISCUSS] Backend choices during runtime

2019-04-11 Thread Junru Shao
We have a systematic solution that avoids the ABI headache. I am struggling
with some errands, and will share our proposal here as soon as I can.
This will be a very interesting topic to discuss. Let's work hard together
and make it perfect :-)

On Thu, Apr 11, 2019 at 12:43 PM Pedro Larroy 
wrote:

> Thanks Marco for raising this issue. I think we can certainly make some
> improvements in modularization and the build. At the same time, Tianqi's
> point of view is important to consider and on point. I see a high risk
> of overengineering in such an endeavor.
>
> I also see increased complexity, difficulty debugging, C++ ABI
> headaches, API compatibility, crashes inside a binary module, etc.,
> which I don't want to deal with as a developer or even as an MXNet
> user. Does somebody have answers to these problems?
>
> If somebody thinks they have a good solution, by all means propose a
> design in the wiki; I think we are all open. Personally, I see several
> other lower-hanging fruits that need our attention:
>  * Simplifying our build logic
>  * CUDA selection in CMake
>  * Reducing the abuse of inlined code by moving more logic to
> implementation files and improving scoping, which will also speed up
> compilation (some units take more than 5 minutes to build and lots of
> RAM on a top-of-the-line CPU core)
>  * Reducing the runtime of some unit tests
>  * Improving MXNet startup time
>  * Thread safety
> These and other improvements in our codebase would bring immediate
> benefits without the risks of overengineering a plugin system. I also
> question our bandwidth for such an endeavor.
>
> I would say, let's apply the KISS principle: let's make the project
> fast to build, easy to work on, well documented, and easy to contribute
> to before building the next Netscape browser. Otherwise we could save
> ourselves this exercise and switch to Rust directly.
>
> Pedro.
>
>
>
> On Mon, Apr 8, 2019 at 9:42 AM Tianqi Chen 
> wrote:
> >
> > Just to clarify: I am not questioning the usefulness of the separation,
> > just want to highlight the technical challenges here based on our past
> > experiences.
> >
> > Crossing DLL boundaries in C++ can create quite a lot of problems,
> > especially when some of the dependencies use a different version of the
> > compiler or static packaging, or simply because of dynamic linking
> > differences on Windows. These problems could make this direction less
> > appealing compared to focusing effort on other things.
> >
> > Technically, as a first step, it is possible to make dependency changes
> > not touch the global header files and instead go through registration,
> > so that changing a certain component won't trigger a global recompile in
> > CMake. This is also a required step toward some modularity.
> >
> > For plugins, solutions that use a C ABI can be used for certain plugin
> > modules.
> >
> > Some of the discussion has been tied to what the interface should look
> > like. I think we should use different threads for those and put in more
> > thought.
> >
> > Tianqi
> >
> >
> >
> > On Sun, Apr 7, 2019 at 4:39 PM kellen sunderland <
> > kellen.sunderl...@gmail.com> wrote:
> >
> > > I think we can make some incremental progress.  My thoughts were along
> > > the lines of plugins (thinking about what happens with the VLC
> > > project).  At process launch time we could gather some information
> > > about our execution environment (either through configuration, or by
> > > convention looking at our folder structure and libraries available).
> > > We could then later load the components we need after understanding if
> > > we're using a CUDA backend and what operators or subgraph components we
> > > would need.  Advantages would be that we would move a lot of the
> > > current conditional compile logic to runtime, and automate a lot of
> > > it.  It would also make packaging binaries for targeted environments a
> > > little easier.  As an example we could compile once, then remove CUDA
> > > focused libraries for systems that are going to run on CPUs.
> > >
> > > On Sun, Apr 7, 2019 at 2:45 PM Tianqi Chen 
> > > wrote:
> > >
> > > > While I personally like the idea, this can be something that is
> > > > fairly technically challenging, and I would caution against this idea
> > > > vs. pushing for good features and just allowing runtime
> > > > configuration.
> > > >
> > > > The main problem here is due to the C++ ABI. There is no standard C++
> > > > ABI across compilers, which means resorting to runtime DLLs and
> > > > dynamic loading brings all sorts of technical problems, especially
> > > > when multiple modules depend on the same third-party dependency (the
> > > > CUDA runtime).
> > > > There is no ready-to-go solution here, especially given the explosion
> > > > of backend variants and dependencies in C++.
> > > > A partial solution could be achieved through the sole use of a C ABI.
> > > > Combining this with code generation can result in some simplifications and
> > 

Re: [MXNET 2.0 Wishlist] [DISCUSS] Backend choices during runtime

2019-04-11 Thread Pedro Larroy
Thanks Marco for raising this issue. I think we can certainly make some
improvements in modularization and the build. At the same time, Tianqi's
point of view is important to consider and on point. I see a high risk
of overengineering in such an endeavor.

I also see increased complexity, difficulty debugging, C++ ABI
headaches, API compatibility, crashes inside a binary module, etc.,
which I don't want to deal with as a developer or even as an MXNet
user. Does somebody have answers to these problems?

If somebody thinks they have a good solution, by all means propose a
design in the wiki; I think we are all open. Personally, I see several
other lower-hanging fruits that need our attention:
 * Simplifying our build logic
 * CUDA selection in CMake
 * Reducing the abuse of inlined code by moving more logic to
implementation files and improving scoping, which will also speed up
compilation (some units take more than 5 minutes to build and lots of
RAM on a top-of-the-line CPU core)
 * Reducing the runtime of some unit tests
 * Improving MXNet startup time
 * Thread safety
These and other improvements in our codebase would bring immediate
benefits without the risks of overengineering a plugin system. I also
question our bandwidth for such an endeavor.

I would say, let's apply the KISS principle: let's make the project
fast to build, easy to work on, well documented, and easy to contribute
to before building the next Netscape browser. Otherwise we could save
ourselves this exercise and switch to Rust directly.

Pedro.



On Mon, Apr 8, 2019 at 9:42 AM Tianqi Chen  wrote:
>
> Just to clarify: I am not questioning the usefulness of the separation,
> just want to highlight the technical challenges here based on our past
> experiences.
>
> Crossing DLL boundaries in C++ can create quite a lot of problems,
> especially when some of the dependencies use a different version of the
> compiler or static packaging, or simply because of dynamic linking
> differences on Windows. These problems could make this direction less
> appealing compared to focusing effort on other things.
>
> Technically, as a first step, it is possible to make dependency changes
> not touch the global header files and instead go through registration, so
> that changing a certain component won't trigger a global recompile in
> CMake. This is also a required step toward some modularity.
>
> For plugins, solutions that use a C ABI can be used for certain plugin
> modules.
>
> Some of the discussion has been tied to what the interface should look
> like. I think we should use different threads for those and put in more
> thought.
>
> Tianqi
>
>
>
> On Sun, Apr 7, 2019 at 4:39 PM kellen sunderland <
> kellen.sunderl...@gmail.com> wrote:
>
> > I think we can make some incremental progress.  My thoughts were along the
> > lines of plugins (thinking about what happens with the VLC project).  At
> > process launch time we could gather some information about our execution
> > environment (either through configuration, or by convention looking at our
> > folder structure and libraries available).  We could then later load the
> > components we need after understanding if we're using a CUDA backend and
> > what operators or subgraph components we would need.  Advantages would be
> > that we would move a lot of the current conditional compile logic to
> > runtime, and automate a lot of it.  It would also make packaging binaries
> > for targeted environments a little easier.  As an example we could compile
> > once, then remove CUDA focused libraries for systems that are going to run
> > on CPUs.
> >
> > On Sun, Apr 7, 2019 at 2:45 PM Tianqi Chen 
> > wrote:
> >
> > > While I personally like the idea, this can be something that is fairly
> > > technically challenging, and I would caution against this idea vs.
> > > pushing for good features and just allowing runtime configuration.
> > >
> > > The main problem here is due to the C++ ABI. There is no standard C++
> > > ABI across compilers, which means resorting to runtime DLLs and dynamic
> > > loading brings all sorts of technical problems, especially when
> > > multiple modules depend on the same third-party dependency (the CUDA
> > > runtime).
> > > There is no ready-to-go solution here, especially given the explosion
> > > of backend variants and dependencies in C++.
> > > A partial solution could be achieved through the sole use of a C ABI.
> > > Combining this with code generation can result in some simplifications
> > > and enable some runtime-loadable modules. TVM does this, and perhaps
> > > MXNet could reuse some of that component for operator libraries.
> > > Similarly, having a customizable operator library that is loadable via
> > > a C ABI might be possible.
> > >
> > > So to summarize, while I really like the idea of dynamically loadable
> > > modules, my past experience suggests that this will bring a lot of
> > > additional engineering burden and technical debt without significant
> > > benefit. I would suggest starting by 
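
As a concrete footnote to the C-ABI point above, a minimal sketch of the
dlopen-based plugin pattern under discussion (POSIX; all names here are
hypothetical): only C types cross the library boundary, which sidesteps the
C++ ABI mismatch between compilers:

#include <dlfcn.h>
#include <cstdio>

// Plugin side, built separately into e.g. libmxnet_cuda_ops.so:
//   extern "C" int MXPluginABIVersion() { return 1; }
//   extern "C" int MXPluginRegisterOps(void* registry) { /* ... */ return 0; }

int main() {
  // Host side: pick the backend library at process launch, e.g. after
  // probing for a usable CUDA driver.
  void* handle = dlopen("./libmxnet_cuda_ops.so", RTLD_NOW | RTLD_LOCAL);
  if (!handle) { std::fprintf(stderr, "%s\n", dlerror()); return 1; }

  using AbiVersionFn = int (*)();
  auto abi_version =
      reinterpret_cast<AbiVersionFn>(dlsym(handle, "MXPluginABIVersion"));
  if (abi_version) std::printf("plugin ABI version %d\n", abi_version());

  dlclose(handle);
  return 0;
}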

Re: Implementing zero-dim and zero-size tensors in MXNet and its impact on your codebases

2019-04-11 Thread Anirudh Subramanian
Hi Marco,

The backend private APIs in engine, executor, storage, ndarray, etc. can
still be changed.
I understand that it may introduce code duplication, but introducing
duplicate C APIs can still be better than the backend developer having to
worry about different frontends. Not to mention a frontend that is not yet
merged into the repo but lives in its own repo; such repos should also be
considered consumers of the MXNet API.

Anirudh

On Thu, Apr 11, 2019 at 12:12 PM Marco de Abreu 
wrote:

> Good point about the adoption speed for the different frontends, Anirudh.
> While this is quite a valid argument, I'm afraid of the complexity it might
> introduce, as well as the risk of further diverging frontend functionality.
>
> I'd rather propose that we introduce a guideline to follow when changes to
> C-APIs are being made. Part of that could be starting a thread like this
> one that lays down the changes that are being made to the C-API. We could
> then coordinate the changes to the different frontends and gather people
> from the community who feel comfortable to do the changes in the respective
> frontends. If nobody speaks up, the original proposer of that change could
> be responsible to do the necessary changes.
>
> An adjacent topic for this discussion could be test coverage: We currently
> have no tools to determine which frontend hits which C-API and where
> changes have to be made. This might be a topic we should spark up again
> separately.
>
> -Marco
>
> On Thu, Apr 11, 2019 at 8:55 PM Marco de Abreu 
> wrote:
>
> > My personal opinion towards that discussion is that we should keep the
> > C-API free from semantic versioning because otherwise we're introducing
> two
> > "fronts" that we have to maintain backwards compatibility for. By the
> way,
> > currently, we have no way to verify and guarantee the compatibility of
> the
> > C-API. The major issue I'd see with adding SemVer for the C-API is that
> > this would increase the complexity of changes that are (in my opinion)
> > entirely internal to MXNet by introducing another thing that developers
> > would have to look out for - possibly introducing code duplication as
> > described by Jun while not providing any clear benefits to me.
> >
> > If there is a use case where people cannot even use our C++ package,
> > then we could have discussions about introducing a user-facing C-API,
> > but right now this approach of interfacing with our C-API (although I
> > know that people use it) seems a bit like using undocumented Windows
> > APIs: they work, but it's at your own risk, they might break at any time,
> > and there's no guarantee.
> >
> > -Marco
> >
> > On Thu, Apr 11, 2019 at 8:52 PM Anirudh Subramanian <
> anirudh2...@gmail.com>
> > wrote:
> >
> >> Hi Jun,
> >>
> >> So far, from what I have observed, there has been an undocumented
> >> guideline not to break C APIs (example:
> >>
> https://github.com/apache/incubator-mxnet/pull/11429#discussion_r199564999
> >> ).
> >> Although the C APIs are supposed to serve only as bridges for frontend
> >> language bindings (exception being C Predict API), I think there are 3rd
> >> party libraries like Horovod which are starting to
> >> depend on it (https://github.com/apache/incubator-mxnet/pull/14615) .
> >>
> >> Also, since MXNet has a lot of frontend bindings, ensuring backward
> >> compatibility with semver can help frontend bindings adopt the new APIs
> >> at their own pace.
> >>
> >> Anirudh
> >>
> >>
> >> On Thu, Apr 11, 2019 at 10:58 AM Jun Wu  wrote:
> >>
> >> > I'm not sure whether C APIs should fall under semver. This is the
> >> > discussion we would like to have with the community.
> >> >
> >> > My thinking on this:
> >> > 1. In most cases, C APIs only serve as bridges between frontend
> >> > language bindings and the C++ backend. Most users/developers do not
> >> > interact directly with C APIs.
> >> > 2. The cases I can think of where C APIs are directly adopted in
> >> > application development are model deployment in a C/C++ environment.
> In
> >> > those cases, developers only interact with C Predict APIs, which we
> >> didn't
> >> > touch.
> >> >
> >> > If the community feels that we are obliged to keep semver for all C
> >> > APIs, we can try to make a copy of the C APIs we intend to modify in
> >> > the PR and keep the old signatures intact; this will introduce a lot
> >> > of duplicate code, though.
> >> >
> >> > On Thu, Apr 11, 2019 at 8:50 AM Anirudh Subramanian <
> >> anirudh2...@gmail.com
> >> > >
> >> > wrote:
> >> >
> >> > > I was under the impression that the C API does fall under semver.
> >> > > Has this been discussed somewhere before? Is this also the case for
> >> > > the C Predict API?
> >> > >
> >> > > On Thu, Apr 11, 2019, 8:08 AM Marco de Abreu <
> marco.g.ab...@gmail.com
> >> >
> >> > > wrote:
> >> > >
> >> > > > In case only changes to the C API are being made, it doesn't fall
> >> > > > under our semantic versioning, since that's not a 

Re: Implementing zero-dim and zero-size tensors in MXNet and its impact on your codebases

2019-04-11 Thread Marco de Abreu
Good point about the adoption speed for the different frontends, Anirudh.
While this is quite a valid argument, I'm afraid of the complexity it might
introduce, as well as the risk of further diverging frontend functionality.

I'd rather propose that we introduce a guideline to follow when changes to
C-APIs are being made. Part of that could be starting a thread like this
one that lays down the changes that are being made to the C-API. We could
then coordinate the changes to the different frontends and gather people
from the community who feel comfortable to do the changes in the respective
frontends. If nobody speaks up, the original proposer of that change could
be responsible to do the necessary changes.

An adjacent topic for this discussion could be test coverage: We currently
have no tools to determine which frontend hits which C-API and where
changes have to be made. This might be a topic we should spark up again
separately.

-Marco

On Thu, Apr 11, 2019 at 8:55 PM Marco de Abreu 
wrote:

> My personal opinion towards that discussion is that we should keep the
> C-API free from semantic versioning because otherwise we're introducing two
> "fronts" that we have to maintain backwards compatibility for. By the way,
> currently, we have no way to verify and guarantee the compatibility of the
> C-API. The major issue I'd see with adding SemVer for the C-API is that
> this would increase the complexity of changes that are (in my opinion)
> entirely internal to MXNet by introducing another thing that developers
> would have to look out for - possibly introducing code duplication as
> described by Jun while not providing any clear benefits to me.
>
> If there is a use case where people cannot even use our C++ package, then
> we could have discussions about introducing a user-facing C-API, but right
> now this approach of interfacing with our C-API (although I know that
> people use it) seems a bit like using undocumented Windows APIs: they work,
> but it's at your own risk, they might break at any time, and there's no
> guarantee.
>
> -Marco
>
> On Thu, Apr 11, 2019 at 8:52 PM Anirudh Subramanian 
> wrote:
>
>> Hi Jun,
>>
>> So far, from what I have observed, there has been an undocumented
>> guideline not to break C APIs (example:
>> https://github.com/apache/incubator-mxnet/pull/11429#discussion_r199564999
>> ).
>> Although the C APIs are supposed to serve only as bridges for frontend
>> language bindings (exception being C Predict API), I think there are 3rd
>> party libraries like Horovod which are starting to
>> depend on it (https://github.com/apache/incubator-mxnet/pull/14615) .
>>
>> Also, since MXNet has a lot of frontend bindings, ensuring backward
>> compatibility with semver can help frontend bindings adopt the new APIs at
>> their own pace.
>>
>> Anirudh
>>
>>
>> On Thu, Apr 11, 2019 at 10:58 AM Jun Wu  wrote:
>>
>> > I'm not sure about whether C APIs should fall under semver. This is the
>> > discussion we would like to have with the community.
>> >
>> > My thinking on this:
>> > 1. In most of the cases, C APIs only serve as bridges between frontend
>> > language bindings and C++ backend. Most users/developers do not
>> interact
>> > directly with C APIs.
>> > 2. The cases I can think of where C APIs are directly adopted in
>> > application development are model deployment in a C/C++ environment. In
>> > those cases, developers only interact with C Predict APIs, which we
>> didn't
>> > touch.
>> >
>> > If the community feel that we are obliged to keep the semver for all C
>> > APIs, we can try to make a copy of the C APIs we intend to modify in
>> the PR
>> > and keep the old signatures intact; this will introduce a lot of
>> duplicate
>> > code though.
>> >
>> > On Thu, Apr 11, 2019 at 8:50 AM Anirudh Subramanian <
>> anirudh2...@gmail.com
>> > >
>> > wrote:
>> >
>> > > I was under the impression that C API does fall under semver. Has this
>> > been
>> > > discussed somewhere before ? Is this also the case for C Predict API ?
>> > >
>> > > On Thu, Apr 11, 2019, 8:08 AM Marco de Abreu > >
>> > > wrote:
>> > >
>> > > > In case only changes to the c-api are being made, it doesn't fall
>> under
>> > > our
>> > > > semantic versioning since that's not a user facing API and thus I'd
>> be
>> > in
>> > > > favour of doing it as part of a minor release. If there is any
>> > > behavioural
>> > > > change from a user perspective (a good indicator would be if tests
>> have
>> > > to
>> > > > be changed as reaction to the Backend changes), then I'd prefer a
>> major
>> > > > release.
>> > > >
>> > > > I'd slightly prefer a minor release since this change touches quite
>> a
>> > few
>> > > > parts and could risk being outdated/diverged as the time until 2.0
>> > > > progresses.
>> > > >
>> > > > -Marco
>> > > >
>> > > > Aaron Markham  schrieb am Do., 11. Apr.
>> > 2019,
>> > > > 16:28:
>> > > >
>> > > > > Just curious about when this kind of change will land. Would it
>> wait
>> > > for
>> > > > > 2.0 or would it be 

Re: Implementing zero-dim and zero-size tensors in MXNet and its impact on your codebases

2019-04-11 Thread Marco de Abreu
My personal opinion towards that discussion is that we should keep the
C-API free from semantic versioning because otherwise we're introducing two
"fronts" that we have to maintain backwards compatibility for. By the way,
currently, we have no way to verify and guarantee the compatibility of the
C-API. The major issue I'd see with adding SemVer for the C-API is that
this would increase the complexity of changes that are (in my opinion)
entirely internal to MXNet by introducing another thing that developers
would have to look out for - possibly introducing code duplication as
described by Jun while not providing any clear benefits to me.

If there is a use-case where people can not even use our C++ package, then
we could have discussions about introducing a user-facing C-API, but right
now this approach to interface with our C-API (although I know that people
use it) seems a bit like using undocumented Windows APIs: they work, but
it's at your own risk, they might break at any time, and there's no guarantee.

-Marco

On Thu, Apr 11, 2019 at 8:52 PM Anirudh Subramanian 
wrote:

> Hi Jun,
>
> Till now from what I have observed this has been an undocumented guideline
> to not break C APIs (example:
> https://github.com/apache/incubator-mxnet/pull/11429#discussion_r199564999
> ).
> Although the C APIs are supposed to serve only as bridges for frontend
> language bindings (exception being C Predict API), I think there are 3rd
> party libraries like Horovod which are starting to
> depend on it (https://github.com/apache/incubator-mxnet/pull/14615) .
>
> Also, since MXNet has a lot of frontend bindings, ensuring backward
> compatibility with semver can help frontend bindings adopt the new APIs at
> their own pace.
>
> Anirudh
>
>
> On Thu, Apr 11, 2019 at 10:58 AM Jun Wu  wrote:
>
> > I'm not sure about whether C APIs should fall under semver. This is the
> > discussion we would like to have with the community.
> >
> > My thinking on this:
> > 1. In most of the cases, C APIs only serve as bridges between frontend
> > language bindings and C++ backend. Most users/developers do not
> interact
> > directly with C APIs.
> > 2. The cases I can think of where C APIs are directly adopted in
> > application development are model deployment in a C/C++ environment. In
> > those cases, developers only interact with C Predict APIs, which we
> didn't
> > touch.
> >
> > If the community feel that we are obliged to keep the semver for all C
> > APIs, we can try to make a copy of the C APIs we intend to modify in the
> PR
> > and keep the old signatures intact; this will introduce a lot of
> duplicate
> > code though.
> >
> > On Thu, Apr 11, 2019 at 8:50 AM Anirudh Subramanian <
> anirudh2...@gmail.com
> > >
> > wrote:
> >
> > > I was under the impression that C API does fall under semver. Has this
> > been
> > > discussed somewhere before ? Is this also the case for C Predict API ?
> > >
> > > On Thu, Apr 11, 2019, 8:08 AM Marco de Abreu 
> > > wrote:
> > >
> > > > In case only changes to the c-api are being made, it doesn't fall
> under
> > > our
> > > > semantic versioning since that's not a user facing API and thus I'd
> be
> > in
> > > > favour of doing it as part of a minor release. If there is any
> > > behavioural
> > > > change from a user perspective (a good indicator would be if tests
> have
> > > to
> > > > be changed as reaction to the Backend changes), then I'd prefer a
> major
> > > > release.
> > > >
> > > > I'd slightly prefer a minor release since this change touches quite a
> > few
> > > > parts and could risk being outdated/diverged as the time until 2.0
> > > > progresses.
> > > >
> > > > -Marco
> > > >
> > > > Aaron Markham  schrieb am Do., 11. Apr.
> > 2019,
> > > > 16:28:
> > > >
> > > > > Just curious about when this kind of change will land. Would it
> wait
> > > for
> > > > > 2.0 or would it be in 1.5 or another minor release?
> > > > >
> > > > > On Thu, Apr 11, 2019, 00:15 Junru Shao 
> > > wrote:
> > > > >
> > > > > > Really nice improvement over MXNet's usability! I suggest that we
> > > could
> > > > > > make numpy-compatible behavior default in 2.0.
> > > > > >
> > > > > > On Wed, Apr 10, 2019 at 11:34 PM Jun Wu 
> > wrote:
> > > > > >
> > > > > > > Dear Community,
> > > > > > >
> > > > > > > A while ago, we sent out an RFC
> > > > > > > 
> > > discussing
> > > > > the
> > > > > > > initiative introducing NumPy compatibility into MXNet. As the
> > first
> > > > > > outcome
> > > > > > > of this initiative, we submitted the PR
> > > > > > > 
> providing
> > > the
> > > > > > > infrastructure of supporting zero-dim (scalar) and zero-size
> > > tensors,
> > > > > > which
> > > > > > > have been long-missing in MXNet.
> > > > > > >
> > > > > > > In our implementation, we have put the best efforts of keeping
> > the
> > > > > > promise
> > > > > > > of backward compatibility in all the language 

Re: Implementing zero-dim and zero-size tensors in MXNet and its impact on your codebases

2019-04-11 Thread Anirudh Subramanian
Hi Jun,

Till now from what I have observed this has been an undocumented guideline
to not break C APIs (example:
https://github.com/apache/incubator-mxnet/pull/11429#discussion_r199564999).
Although the C APIs are supposed to serve only as bridges for frontend
language bindings (exception being C Predict API), I think there are 3rd
party libraries like Horovod which are starting to
depend on it (https://github.com/apache/incubator-mxnet/pull/14615) .

Also, since MXNet has a lot of frontend bindings, ensuring backward
compatibility with semver can help frontend bindings adopt the new APIs at
their own pace.

Anirudh


On Thu, Apr 11, 2019 at 10:58 AM Jun Wu  wrote:

> I'm not sure about whether C APIs should fall under semver. This is the
> discussion we would like to have with the community.
>
> My thinking on this:
> 1. In most of the cases, C APIs only serve as bridges between frontend
> language bindings and C++ backend. Most users/developers do not interact
> directly with C APIs.
> 2. The cases I can think of where C APIs are directly adopted in
> application development are model deployment in a C/C++ environment. In
> those cases, developers only interact with C Predict APIs, which we didn't
> touch.
>
> If the community feel that we are obliged to keep the semver for all C
> APIs, we can try to make a copy of the C APIs we intend to modify in the PR
> and keep the old signatures intact; this will introduce a lot of duplicate
> code though.
>
> On Thu, Apr 11, 2019 at 8:50 AM Anirudh Subramanian  >
> wrote:
>
> > I was under the impression that C API does fall under semver. Has this
> been
> > discussed somewhere before ? Is this also the case for C Predict API ?
> >
> > On Thu, Apr 11, 2019, 8:08 AM Marco de Abreu 
> > wrote:
> >
> > > In case only changes to the c-api are being made, it doesn't fall under
> > our
> > > semantic versioning since that's not a user facing API and thus I'd be
> in
> > > favour of doing it as part of a minor release. If there is any
> > behavioural
> > > change from a user perspective (a good indicator would be if tests have
> > to
> > > be changed as reaction to the Backend changes), then I'd prefer a major
> > > release.
> > >
> > > I'd slightly prefer a minor release since this change touches quite a
> few
> > > parts and could risk being outdated/diverged as the time until 2.0
> > > progresses.
> > >
> > > -Marco
> > >
> > > Aaron Markham  schrieb am Do., 11. Apr.
> 2019,
> > > 16:28:
> > >
> > > > Just curious about when this kind of change will land. Would it wait
> > for
> > > > 2.0 or would it be in 1.5 or another minor release?
> > > >
> > > > On Thu, Apr 11, 2019, 00:15 Junru Shao 
> > wrote:
> > > >
> > > > > Really nice improvement over MXNet's usability! I suggest that we
> > could
> > > > > make numpy-compatible behavior default in 2.0.
> > > > >
> > > > > On Wed, Apr 10, 2019 at 11:34 PM Jun Wu 
> wrote:
> > > > >
> > > > > > Dear Community,
> > > > > >
> > > > > > A while ago, we sent out an RFC
> > > > > > 
> > discussing
> > > > the
> > > > > > initiative introducing NumPy compatibility into MXNet. As the
> first
> > > > > outcome
> > > > > > of this initiative, we submitted the PR
> > > > > >  providing
> > the
> > > > > > infrastructure of supporting zero-dim (scalar) and zero-size
> > tensors,
> > > > > which
> > > > > > have been long-missing in MXNet.
> > > > > >
> > > > > > In our implementation, we have put the best efforts of keeping
> the
> > > > > promise
> > > > > > of backward compatibility in all the language bindings.
> > Nevertheless,
> > > > we
> > > > > > still would like to call out the changes explicitly that may
> impact
> > > > your
> > > > > > existing codebases developed on top of MXNet by calling C-APIs
> > > directly
> > > > > or
> > > > > > implementing operators in your own repos.
> > > > > >
> > > > > > 1. In your application, if you called any one of the following
> > > > > shape-related
> > > > > > C-APIs, you will need to change the data type of shape's ndim and
> > > > > dim_size
> > > > > > from *unsigned int* to signed *int*, because we have to use -1 to
> > > > > represent
> > > > > > unknown shape information, and reserve 0 for scalar and zero-size
> > > > > tensors.
> > > > > > One example of such changes can be seen in the cpp-package
> > > > > > <
> > > > > >
> > > > >
> > > >
> > >
> >
> https://github.com/apache/incubator-mxnet/pull/14661/files#diff-c0e1fcfe1619faa4ff5f59d94e8bR183
> > > > > > >
> > > > > > calling MXSymbolInferShape.
> > > > > > - MXSymbolInferShape
> > > > > > - MXSymbolInferShapePartial
> > > > > > - MXExecutorSimpleBind
> > > > > > - MXExecutorReshape
> > > > > > - MXNDArrayGetShape
> > > > > > - MXNDArrayCreateFromSharedMem
> > > > > >
> > > > > > 2. If you have implemented operators in your own codebases, you
> > will
> > > > > > probably need to change every operator's 

Re: Implementing zero-dim and zero-size tensors in MXNet and its impact on your codebases

2019-04-11 Thread Marco de Abreu
Hi Jun,

we've had a previous discussion on this topic here:
https://lists.apache.org/thread.html/f0d7d96f9737479ec57580a977e9169544ffa1bc1a8ae21ab18fc6a0@%3Cdev.mxnet.apache.org%3E

Best regards,
Marco

On Thu, Apr 11, 2019 at 7:58 PM Jun Wu  wrote:

> I'm not sure about whether C APIs should fall under semver. This is the
> discussion we would like to have with the community.
>
> My thinking on this:
> 1. In most of the cases, C APIs only serve as bridges between frontend
> language bindings and C++ backend. Most users/developers do not interact
> directly with C APIs.
> 2. The cases I can think of where C APIs are directly adopted in
> application development are model deployment in a C/C++ environment. In
> those cases, developers only interact with C Predict APIs, which we didn't
> touch.
>
> If the community feel that we are obliged to keep the semver for all C
> APIs, we can try to make a copy of the C APIs we intend to modify in the PR
> and keep the old signatures intact; this will introduce a lot of duplicate
> code though.
>
> On Thu, Apr 11, 2019 at 8:50 AM Anirudh Subramanian  >
> wrote:
>
> > I was under the impression that C API does fall under semver. Has this
> been
> > discussed somewhere before ? Is this also the case for C Predict API ?
> >
> > On Thu, Apr 11, 2019, 8:08 AM Marco de Abreu 
> > wrote:
> >
> > > In case only changes to the c-api are being made, it doesn't fall under
> > our
> > > semantic versioning since that's not a user facing API and thus I'd be
> in
> > > favour of doing it as part of a minor release. If there is any
> > behavioural
> > > change from a user perspective (a good indicator would be if tests have
> > to
> > > be changed as reaction to the Backend changes), then I'd prefer a major
> > > release.
> > >
> > > I'd slightly prefer a minor release since this change touches quite a
> few
> > > parts and could risk being outdated/diverged as the time until 2.0
> > > progresses.
> > >
> > > -Marco
> > >
> > > Aaron Markham  schrieb am Do., 11. Apr.
> 2019,
> > > 16:28:
> > >
> > > > Just curious about when this kind of change will land. Would it wait
> > for
> > > > 2.0 or would it be in 1.5 or another minor release?
> > > >
> > > > On Thu, Apr 11, 2019, 00:15 Junru Shao 
> > wrote:
> > > >
> > > > > Really nice improvement over MXNet's usability! I suggest that we
> > could
> > > > > make numpy-compatible behavior default in 2.0.
> > > > >
> > > > > On Wed, Apr 10, 2019 at 11:34 PM Jun Wu 
> wrote:
> > > > >
> > > > > > Dear Community,
> > > > > >
> > > > > > A while ago, we sent out an RFC
> > > > > > 
> > discussing
> > > > the
> > > > > > initiative introducing NumPy compatibility into MXNet. As the
> first
> > > > > outcome
> > > > > > of this initiative, we submitted the PR
> > > > > >  providing
> > the
> > > > > > infrastructure of supporting zero-dim (scalar) and zero-size
> > tensors,
> > > > > which
> > > > > > have been long-missing in MXNet.
> > > > > >
> > > > > > In our implementation, we have put the best efforts of keeping
> the
> > > > > promise
> > > > > > of backward compatibility in all the language bindings.
> > Nevertheless,
> > > > we
> > > > > > still would like to call out the changes explicitly that may
> impact
> > > > your
> > > > > > existing codebases developed on top of MXNet by calling C-APIs
> > > directly
> > > > > or
> > > > > > implementing operators in your own repos.
> > > > > >
> > > > > > 1. In your application, if you called any one of the following
> > > > > shape-related
> > > > > > C-APIs, you will need to change the data type of shape's ndim and
> > > > > dim_size
> > > > > > from *unsigned int* to signed *int*, because we have to use -1 to
> > > > > represent
> > > > > > unknown shape information, and reserve 0 for scalar and zero-size
> > > > > tensors.
> > > > > > One example of such changes can be seen in the cpp-package
> > > > > > <
> > > > > >
> > > > >
> > > >
> > >
> >
> https://github.com/apache/incubator-mxnet/pull/14661/files#diff-c0e1fcfe1619faa4ff5f59d94e8bR183
> > > > > > >
> > > > > > calling MXSymbolInferShape.
> > > > > > - MXSymbolInferShape
> > > > > > - MXSymbolInferShapePartial
> > > > > > - MXExecutorSimpleBind
> > > > > > - MXExecutorReshape
> > > > > > - MXNDArrayGetShape
> > > > > > - MXNDArrayCreateFromSharedMem
> > > > > >
> > > > > > 2. If you have implemented operators in your own codebases, you
> > will
> > > > > > probably need to change every operator's shape inference function
> > to
> > > > use
> > > > > > the following util functions to check whether shape information
> is
> > > > known,
> > > > > > instead of checking against 0 directly. One example of such
> changes
> > > can
> > > > > be
> > > > > > seen in the shape inference function
> > > > > > <
> > > > > >
> > > > >
> > > >
> > >
> >
> 

Re: Implementing zero-dim and zero-size tensors in MXNet and its impact on your codebases

2019-04-11 Thread Jun Wu
I'm not sure about whether C APIs should fall under semver. This is the
discussion we would like to have with the community.

My thinking on this:
1. In most of the cases, C APIs only serve as bridges between frontend
language bindings and C++ backend. Most users/developers do not interact
directly with C APIs.
2. The cases I can think of where C APIs are directly adopted in
application development are model deployment in a C/C++ environment. In
those cases, developers only interact with C Predict APIs, which we didn't
touch.

If the community feel that we are obliged to keep the semver for all C
APIs, we can try to make a copy of the C APIs we intend to modify in the PR
and keep the old signatures intact; this will introduce a lot of duplicate
code though.
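
For what it's worth, a minimal sketch of what such a duplicated API could look
like for one of the shape calls. The Ex-suffixed name and the wrapper body are
illustrative assumptions, not a committed design; NDArrayHandle and mx_uint are
the types from mxnet/c_api.h:

    // New API: signed ints, so ndim == -1 can represent "unknown" and 0 is
    // freed up for scalar and zero-size tensors.
    int MXNDArrayGetShapeEx(NDArrayHandle handle,
                            int *out_dim, const int **out_pdata);

    // Old API kept with its original unsigned signature; it simply forwards
    // to the new one. The casts are only safe while the caller never sees
    // the new -1 encoding -- which is exactly the duplication cost above.
    int MXNDArrayGetShape(NDArrayHandle handle,
                          mx_uint *out_dim, const mx_uint **out_pdata) {
      int ndim = 0;
      const int *pdata = nullptr;
      int ret = MXNDArrayGetShapeEx(handle, &ndim, &pdata);
      if (ret != 0) return ret;
      *out_dim = static_cast<mx_uint>(ndim < 0 ? 0 : ndim);   // map unknown back to 0
      *out_pdata = reinterpret_cast<const mx_uint *>(pdata);  // same layout, different sign
      return 0;
    }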

On Thu, Apr 11, 2019 at 8:50 AM Anirudh Subramanian 
wrote:

> I was under the impression that C API does fall under semver. Has this been
> discussed somewhere before ? Is this also the case for C Predict API ?
>
> On Thu, Apr 11, 2019, 8:08 AM Marco de Abreu 
> wrote:
>
> > In case only changes to the c-api are being made, it doesn't fall under
> our
> > semantic versioning since that's not a user facing API and thus I'd be in
> > favour of doing it as part of a minor release. If there is any
> behavioural
> > change from a user perspective (a good indicator would be if tests have
> to
> > be changed as reaction to the Backend changes), then I'd prefer a major
> > release.
> >
> > I'd slightly prefer a minor release since this change touches quite a few
> > parts and could risk being outdated/diverged as the time until 2.0
> > progresses.
> >
> > -Marco
> >
> > Aaron Markham  schrieb am Do., 11. Apr. 2019,
> > 16:28:
> >
> > > Just curious about when this kind of change will land. Would it wait
> for
> > > 2.0 or would it be in 1.5 or another minor release?
> > >
> > > On Thu, Apr 11, 2019, 00:15 Junru Shao 
> wrote:
> > >
> > > > Really nice improvement over MXNet's usability! I suggest that we
> could
> > > > make numpy-compatible behavior default in 2.0.
> > > >
> > > > On Wed, Apr 10, 2019 at 11:34 PM Jun Wu  wrote:
> > > >
> > > > > Dear Community,
> > > > >
> > > > > A while ago, we sent out an RFC
> > > > > 
> discussing
> > > the
> > > > > initiative introducing NumPy compatibility into MXNet. As the first
> > > > outcome
> > > > > of this initiative, we submitted the PR
> > > > >  providing
> the
> > > > > infrastructure of supporting zero-dim (scalar) and zero-size
> tensors,
> > > > which
> > > > > have been long-missing in MXNet.
> > > > >
> > > > > In our implementation, we have put the best efforts of keeping the
> > > > promise
> > > > > of backward compatibility in all the language bindings.
> Nevertheless,
> > > we
> > > > > still would like to call out the changes explicitly that may impact
> > > your
> > > > > existing codebases developed on top of MXNet by calling C-APIs
> > directly
> > > > or
> > > > > implementing operators in your own repos.
> > > > >
> > > > > 1. In your application, if you called any one of the following
> > > > shape-related
> > > > > C-APIs, you will need to change the data type of shape's ndim and
> > > > dim_size
> > > > > from *unsigned int* to signed *int*, because we have to use -1 to
> > > > represent
> > > > > unknown shape information, and reserve 0 for scalar and zero-size
> > > > tensors.
> > > > > One example of such changes can be seen in the cpp-package
> > > > > <
> > > > >
> > > >
> > >
> >
> https://github.com/apache/incubator-mxnet/pull/14661/files#diff-c0e1fcfe1619faa4ff5f59d94e8bR183
> > > > > >
> > > > > calling MXSymbolInferShape.
> > > > > - MXSymbolInferShape
> > > > > - MXSymbolInferShapePartial
> > > > > - MXExecutorSimpleBind
> > > > > - MXExecutorReshape
> > > > > - MXNDArrayGetShape
> > > > > - MXNDArrayCreateFromSharedMem
> > > > >
> > > > > 2. If you have implemented operators in your own codebases, you
> will
> > > > > probably need to change every operator's shape inference function
> to
> > > use
> > > > > the following util functions to check whether shape information is
> > > known,
> > > > > instead of checking against 0 directly. One example of such changes
> > can
> > > > be
> > > > > seen in the shape inference function
> > > > > <
> > > > >
> > > >
> > >
> >
> https://github.com/apache/incubator-mxnet/pull/14661/files#diff-afa640c4653c59f00f43a84455f91ef9R35
> > > > > >
> > > > > of concat operator.
> > > > > - shape_is_known (include/mxnet/tuple.h)
> > > > > - ndim_is_known (include/mxnet/tuple.h)
> > > > > - dim_size_is_known (include/mxnet/tuple.h)
> > > > >
> > > > > If you are interested in knowing the value of scalar tensors, and
> > hence
> > > > > understanding our motivation further, this thread
> > > > > <
> > https://discuss.mxnet.io/t/rank-0-arrays-in-mxnet-aka-pi-is-wrong/108
> > > >
> > > > of
> > > > > discussion provides very good 

Re: Implementing zero-dim and zero-size tensors in MXNet and its impact on your codebases

2019-04-11 Thread Anirudh Subramanian
I was under the impression that C API does fall under semver. Has this been
discussed somewhere before ? Is this also the case for C Predict API ?

On Thu, Apr 11, 2019, 8:08 AM Marco de Abreu 
wrote:

> In case only changes to the c-api are being made, it doesn't fall under our
> semantic versioning since that's not a user facing API and thus I'd be in
> favour of doing it as part of a minor release. If there is any behavioural
> change from a user perspective (a good indicator would be if tests have to
> be changed as reaction to the Backend changes), then I'd prefer a major
> release.
>
> I'd slightly prefer a minor release since this change touches quite a few
> parts and could risk being outdated/diverged as the time until 2.0
> progresses.
>
> -Marco
>
> Aaron Markham  schrieb am Do., 11. Apr. 2019,
> 16:28:
>
> > Just curious about when this kind of change will land. Would it wait for
> > 2.0 or would it be in 1.5 or another minor release?
> >
> > On Thu, Apr 11, 2019, 00:15 Junru Shao  wrote:
> >
> > > Really nice improvement over MXNet's usability! I suggest that we could
> > > make numpy-compatible behavior default in 2.0.
> > >
> > > On Wed, Apr 10, 2019 at 11:34 PM Jun Wu  wrote:
> > >
> > > > Dear Community,
> > > >
> > > > A while ago, we sent out an RFC
> > > >  discussing
> > the
> > > > initiative introducing NumPy compatibility into MXNet. As the first
> > > outcome
> > > > of this initiative, we submitted the PR
> > > >  providing the
> > > > infrastructure of supporting zero-dim (scalar) and zero-size tensors,
> > > which
> > > > have been long-missing in MXNet.
> > > >
> > > > In our implementation, we have put the best efforts of keeping the
> > > promise
> > > > of backward compatibility in all the language bindings. Nevertheless,
> > we
> > > > still would like to call out the changes explicitly that may impact
> > your
> > > > existing codebases developed on top of MXNet by calling C-APIs
> directly
> > > or
> > > > implementing operators in your own repos.
> > > >
> > > > 1. In your application, if you called any one of the following
> > > shape-related
> > > > C-APIs, you will need to change the data type of shape's ndim and
> > > dim_size
> > > > from *unsigned int* to signed *int*, because we have to use -1 to
> > > represent
> > > > unknown shape information, and reserve 0 for scalar and zero-size
> > > tensors.
> > > > One example of such changes can be seen in the cpp-package
> > > > <
> > > >
> > >
> >
> https://github.com/apache/incubator-mxnet/pull/14661/files#diff-c0e1fcfe1619faa4ff5f59d94e8bR183
> > > > >
> > > > calling MXSymbolInferShape.
> > > > - MXSymbolInferShape
> > > > - MXSymbolInferShapePartial
> > > > - MXExecutorSimpleBind
> > > > - MXExecutorReshape
> > > > - MXNDArrayGetShape
> > > > - MXNDArrayCreateFromSharedMem
> > > >
> > > > 2. If you have implemented operators in your own codebases, you will
> > > > probably need to change every operator's shape inference function to
> > use
> > > > the following util functions to check whether shape information is
> > known,
> > > > instead of checking against 0 directly. One example of such changes
> can
> > > be
> > > > seen in the shape inference function
> > > > <
> > > >
> > >
> >
> https://github.com/apache/incubator-mxnet/pull/14661/files#diff-afa640c4653c59f00f43a84455f91ef9R35
> > > > >
> > > > of concat operator.
> > > > - shape_is_known (include/mxnet/tuple.h)
> > > > - ndim_is_known (include/mxnet/tuple.h)
> > > > - dim_size_is_known (include/mxnet/tuple.h)
> > > >
> > > > If you are interested in knowing the value of scalar tensors, and
> hence
> > > > understanding our motivation further, this thread
> > > > <
> https://discuss.mxnet.io/t/rank-0-arrays-in-mxnet-aka-pi-is-wrong/108
> > >
> > > of
> > > > discussion provides very good insights from the view of data science.
> > It
> > > > was actually related to an opportunity for MXNet becoming the backend
> > of
> > > > PyMC , but somehow it didn't go
> > > > through due to missing several key features
> > > > ,
> and
> > > > scalar tensors is one of them.
> > > >
> > > > Please leave comments in the PR
> > > >  if you have
> any
> > > > concerns or suggestions of our work.
> > > >
> > > > Thank you very much for your time and consideration.
> > > >
> > > > Best,
> > > > Jun
> > > >
> > > > *References*
> > > > [1] RFC of NumPy compatibility:
> > > > https://github.com/apache/incubator-mxnet/issues/14253
> > > > [2] Pull request of supporting scalar and zero-size tensors:
> > > > https://github.com/apache/incubator-mxnet/pull/14661
> > > > [3] The value of scalar tensors from the view of data science:
> > > >
> https://discuss.mxnet.io/t/rank-0-arrays-in-mxnet-aka-pi-is-wrong/108
> > > > 

Re: Implementing zero-dim and zero-size tensors in MXNet and its impact on your codebases

2019-04-11 Thread Marco de Abreu
In case only changes to the c-api are being made, it doesn't fall under our
semantic versioning since that's not a user facing API and thus I'd be in
favour of doing it as part of a minor release. If there is any behavioural
change from a user perspective (a good indicator would be if tests have to
be changed as reaction to the Backend changes), then I'd prefer a major
release.

I'd slightly prefer a minor release since this change touches quite a few
parts and could risk being outdated/diverged as the time until 2.0
progresses.

-Marco

Aaron Markham  schrieb am Do., 11. Apr. 2019,
16:28:

> Just curious about when this kind of change will land. Would it wait for
> 2.0 or would it be in 1.5 or another minor release?
>
> On Thu, Apr 11, 2019, 00:15 Junru Shao  wrote:
>
> > Really nice improvement over MXNet's usability! I suggest that we could
> > make numpy-compatible behavior default in 2.0.
> >
> > On Wed, Apr 10, 2019 at 11:34 PM Jun Wu  wrote:
> >
> > > Dear Community,
> > >
> > > A while ago, we sent out an RFC
> > >  discussing
> the
> > > initiative introducing NumPy compatibility into MXNet. As the first
> > outcome
> > > of this initiative, we submitted the PR
> > >  providing the
> > > infrastructure of supporting zero-dim (scalar) and zero-size tensors,
> > which
> > > have been long-missing in MXNet.
> > >
> > > In our implementation, we have put the best efforts of keeping the
> > promise
> > > of backward compatibility in all the language bindings. Nevertheless,
> we
> > > still would like to call out the changes explicitly that may impact
> your
> > > existing codebases developed on top of MXNet by calling C-APIs directly
> > or
> > > implementing operators in your own repos.
> > >
> > > 1. In your application, if you called any one of the following
> > shape-related
> > > C-APIs, you will need to change the data type of shape's ndim and
> > dim_size
> > > from *unsigned int* to signed *int*, because we have to use -1 to
> > represent
> > > unknown shape information, and reserve 0 for scalar and zero-size
> > tensors.
> > > One example of such changes can be seen in the cpp-package
> > > <
> > >
> >
> https://github.com/apache/incubator-mxnet/pull/14661/files#diff-c0e1fcfe1619faa4ff5f59d94e8bR183
> > > >
> > > calling MXSymbolInferShape.
> > > - MXSymbolInferShape
> > > - MXSymbolInferShapePartial
> > > - MXExecutorSimpleBind
> > > - MXExecutorReshape
> > > - MXNDArrayGetShape
> > > - MXNDArrayCreateFromSharedMem
> > >
> > > 2. If you have implemented operators in your own codebases, you will
> > > probably need to change every operator's shape inference function to
> use
> > > the following util functions to check whether shape information is
> known,
> > > instead of checking against 0 directly. One example of such changes can
> > be
> > > seen in the shape inference function
> > > <
> > >
> >
> https://github.com/apache/incubator-mxnet/pull/14661/files#diff-afa640c4653c59f00f43a84455f91ef9R35
> > > >
> > > of concat operator.
> > > - shape_is_known (include/mxnet/tuple.h)
> > > - ndim_is_known (include/mxnet/tuple.h)
> > > - dim_size_is_known (include/mxnet/tuple.h)
> > >
> > > If you are interested in knowing the value of scalar tensors, and hence
> > > understanding our motivation further, this thread
> > >  >
> > of
> > > discussion provides very good insights from the view of data science.
> It
> > > was actually related to an opportunity for MXNet becoming the backend
> of
> > > PyMC , but somehow it didn't go
> > > through due to missing several key features
> > > , and
> > > scalar tensors is one of them.
> > >
> > > Please leave comments in the PR
> > >  if you have any
> > > concerns or suggestions of our work.
> > >
> > > Thank you very much for your time and consideration.
> > >
> > > Best,
> > > Jun
> > >
> > > *References*
> > > [1] RFC of NumPy compatibility:
> > > https://github.com/apache/incubator-mxnet/issues/14253
> > > [2] Pull request of supporting scalar and zero-size tensors:
> > > https://github.com/apache/incubator-mxnet/pull/14661
> > > [3] The value of scalar tensors from the view of data science:
> > > https://discuss.mxnet.io/t/rank-0-arrays-in-mxnet-aka-pi-is-wrong/108
> > > [4] Previous discussion for MXNet becoming the backend of PyMC:
> > > https://discuss.mxnet.io/t/moving-pymc3-from-theano-to-mxnet/86
> > >
> >
>


Re: Implementing zero-dim and zero-size tensors in MXNet and its impact on your codebases

2019-04-11 Thread Aaron Markham
Just curious about when this kind of change will land. Would it wait for
2.0 or would it be in 1.5 or another minor release?

On Thu, Apr 11, 2019, 00:15 Junru Shao  wrote:

> Really nice improvement over MXNet's usability! I suggest that we could
> make numpy-compatible behavior default in 2.0.
>
> On Wed, Apr 10, 2019 at 11:34 PM Jun Wu  wrote:
>
> > Dear Community,
> >
> > A while ago, we sent out an RFC
> >  discussing the
> > initiative introducing NumPy compatibility into MXNet. As the first
> outcome
> > of this initiative, we submitted the PR
> >  providing the
> > infrastructure of supporting zero-dim (scalar) and zero-size tensors,
> which
> > have been long-missing in MXNet.
> >
> > In our implementation, we have put the best efforts of keeping the
> promise
> > of backward compatibility in all the language bindings. Nevertheless, we
> > still would like to call out the changes explicitly that may impact your
> > existing codebases developed on top of MXNet by calling C-APIs directly
> or
> > implementing operators in your own repos.
> >
> > 1. In your application, if you called any one of the following
> shape-related
> > C-APIs, you will need to change the data type of shape's ndim and
> dim_size
> > from *unsigned int* to signed *int*, because we have to use -1 to
> represent
> > unknown shape information, and reserve 0 for scalar and zero-size
> tensors.
> > One example of such changes can be seen in the cpp-package
> > <
> >
> https://github.com/apache/incubator-mxnet/pull/14661/files#diff-c0e1fcfe1619faa4ff5f59d94e8bR183
> > >
> > calling MXSymbolInferShape.
> > - MXSymbolInferShape
> > - MXSymbolInferShapePartial
> > - MXExecutorSimpleBind
> > - MXExecutorReshape
> > - MXNDArrayGetShape
> > - MXNDArrayCreateFromSharedMem
> >
> > 2. If you have implemented operators in your own codebases, you will
> > probably need to change every operator's shape inference function to use
> > the following util functions to check whether shape information is known,
> > instead of checking against 0 directly. One example of such changes can
> be
> > seen in the shape inference function
> > <
> >
> https://github.com/apache/incubator-mxnet/pull/14661/files#diff-afa640c4653c59f00f43a84455f91ef9R35
> > >
> > of concat operator.
> > - shape_is_known (include/mxnet/tuple.h)
> > - ndim_is_known (include/mxnet/tuple.h)
> > - dim_size_is_known (include/mxnet/tuple.h)
> >
> > If you are interested in knowing the value of scalar tensors, and hence
> > understanding our motivation further, this thread
> > 
> of
> > discussion provides very good insights from the view of data science. It
> > was actually related to an opportunity for MXNet becoming the backend of
> > PyMC , but somehow it didn't go
> > through due to missing several key features
> > , and
> > scalar tensors is one of them.
> >
> > Please leave comments in the PR
> >  if you have any
> > concerns or suggestions of our work.
> >
> > Thank you very much for your time and consideration.
> >
> > Best,
> > Jun
> >
> > *References*
> > [1] RFC of NumPy compatibility:
> > https://github.com/apache/incubator-mxnet/issues/14253
> > [2] Pull request of supporting scalar and zero-size tensors:
> > https://github.com/apache/incubator-mxnet/pull/14661
> > [3] The value of scalar tensors from the view of data science:
> > https://discuss.mxnet.io/t/rank-0-arrays-in-mxnet-aka-pi-is-wrong/108
> > [4] Previous discussion for MXNet becoming the backend of PyMC:
> > https://discuss.mxnet.io/t/moving-pymc3-from-theano-to-mxnet/86
> >
>


Re: MXNet Berlin User Group

2019-04-11 Thread Per da Silva
In addition to talks, we could also consider some mob programming sessions?
Maybe use the time to (as a small group) tackle some of the open issues?
Maybe this also helps new members get up to speed with the code and
tooling, etc., as well as develop the community by solving problems together
=)

On Thu, Apr 11, 2019 at 11:01 AM Jose Luis Contreras Santos <
joseluis.contreras.san...@gmail.com> wrote:

> We didn't have any attendees this time either. As Chance says, I believe we
> need to rethink these user groups; we very rarely have any users as they
> stand right now. The idea of having talks from contributors sounds like an
> interesting one to me.
>
> Jose
>
> El jue., 11 abr. 2019 a las 10:14, Chance Bair ()
> escribió:
>
> > I think it would drive attendance if we had a quick talk each week from
> an
> > interested contributor.  We could start a sign up list for topics and
> > issues.
> >
> > Chance Bair
> >
> >
> >
> > On Thu, Apr 11, 2019 at 9:29 AM Isabel Drost-Fromm 
> > wrote:
> >
> >>
> >>
> >> Am 9. April 2019 17:56:21 MESZ schrieb Jose Luis Contreras Santos <
> >> joseluis.contreras.san...@gmail.com>:
> >> >This is a friendly reminder that the MXNet Berlin User Group will be
> >> >held
> >> >today, starting in a few minutes at 6pm-7pm (CEST) / 9am-10am (PST).
> >>
> >> Would you mind providing a brief summary of the user group here? How
> many
> >> ppl attended, what were the most interesting topics discussed?
> >>
> >> Going forward, please make sure to post this summary here. It may help
> >> others understand why the meetup would be interesting to attend. It may
> >> also help understand what downstream users and contributors would like
> to
> >> see in the meetup, whether or not a pre defined agenda would make sense,
> >> which frequency works best for people, whether or not the offer is
> needed
> >> at all.
> >>
> >> Isabel
> >>
> >>
> >> --
> >> This message was sent with K-9 from a mobile device with swipe to type
> >> enabled. I'm sorry for any embarrassing typos that slipped through.
> >>
> >
>


Re: MXNet Berlin User Group

2019-04-11 Thread Jose Luis Contreras Santos
We didn't have any attendees this time either. As Chance says, I believe we
need to rethink these user groups; we very rarely have any users as they
stand right now. The idea of having talks from contributors sounds like an
interesting one to me.

Jose

El jue., 11 abr. 2019 a las 10:14, Chance Bair ()
escribió:

> I think it would drive attendance if we had a quick talk each week from an
> interested contributor.  We could start a sign up list for topics and
> issues.
>
> Chance Bair
>
>
>
> On Thu, Apr 11, 2019 at 9:29 AM Isabel Drost-Fromm 
> wrote:
>
>>
>>
>> Am 9. April 2019 17:56:21 MESZ schrieb Jose Luis Contreras Santos <
>> joseluis.contreras.san...@gmail.com>:
>> >This is a friendly reminder that the MXNet Berlin User Group will be
>> >held
>> >today, starting in a few minutes at 6pm-7pm (CEST) / 9am-10am (PST).
>>
>> Would you mind providing a brief summary of the user group here? How many
>> ppl attended, what were the most interesting topics discussed?
>>
>> Going forward, please make sure to post this summary here. It may help
>> others understand why the meetup would be interesting to attend. It may
>> also help understand what downstream users and contributors would like to
>> see in the meetup, whether or not a pre defined agenda would make sense,
>> which frequency works best for people, whether or not the offer is needed
>> at all.
>>
>> Isabel
>>
>>
>> --
>> This message was sent with K-9 from a mobile device with swipe to type
>> enabled. I'm sorry for any embarrassing typos that slipped through.
>>
>


Re: MXNet Berlin User Group

2019-04-11 Thread Chance Bair
I think it would drive attendance if we had a quick talk each week from an
interested contributor.  We could start a sign up list for topics and
issues.

Chance Bair



On Thu, Apr 11, 2019 at 9:29 AM Isabel Drost-Fromm 
wrote:

>
>
> Am 9. April 2019 17:56:21 MESZ schrieb Jose Luis Contreras Santos <
> joseluis.contreras.san...@gmail.com>:
> >This is a friendly reminder that the MXNet Berlin User Group will be
> >held
> >today, starting in a few minutes at 6pm-7pm (CEST) / 9am-10am (PST).
>
> Would you mind providing a brief summary of the user group here? How many
> ppl attended, what were the most interesting topics discussed?
>
> Going forward, please make sure to post this summary here. It may help
> others understand why the meetup would be interesting to attend. It may
> also help understand what downstream users and contributors would like to
> see in the meetup, whether or not a pre defined agenda would make sense,
> which frequency works best for people, whether or not the offer is needed
> at all.
>
> Isabel
>
>
> --
> This message was sent with K-9 from a mobile device with swipe to type
> enabled. I'm sorry for any embarrassing typos that slipped through.
>


Re: MXNet Berlin User Group

2019-04-11 Thread Isabel Drost-Fromm



Am 9. April 2019 17:56:21 MESZ schrieb Jose Luis Contreras Santos 
:
>This is a friendly reminder that the MXNet Berlin User Group will be
>held
>today, starting in a few minutes at 6pm-7pm (CEST) / 9am-10am (PST).

Would you mind providing a brief summary of the user group here? How many ppl 
attended, what were the most interesting topics discussed?

Going forward, please make sure to post this summary here. It may help others 
understand why the meetup would be interesting to attend. It may also help 
understand what downstream users and contributors would like to see in the 
meetup, whether or not a pre defined agenda would make sense, which frequency 
works best for people, whether or not the offer is needed at all.

Isabel


-- 
This message was sent with K-9 from a mobile device with swipe to type enabled. 
I'm sorry for any embarrassing typos that slipped through.


Re: Implementing zero-dim and zero-size tensors in MXNet and its impact on your codebases

2019-04-11 Thread Junru Shao
Really nice improvement over MXNet's usability! I suggest that we could
make numpy-compatible behavior default in 2.0.

On Wed, Apr 10, 2019 at 11:34 PM Jun Wu  wrote:

> Dear Community,
>
> A while ago, we sent out an RFC
>  discussing the
> initiative introducing NumPy compatibility into MXNet. As the first outcome
> of this initiative, we submitted the PR
>  providing the
> infrastructure of supporting zero-dim (scalar) and zero-size tensors, which
> have been long-missing in MXNet.
>
> In our implementation, we have put the best efforts of keeping the promise
> of backward compatibility in all the language bindings. Nevertheless, we
> still would like to call out the changes explicitly that may impact your
> existing codebases developed on top of MXNet by calling C-APIs directly or
> implementing operators in your own repos.
>
> 1. In your application, if you called any one of the following shape-related
> C-APIs, you will need to change the data type of shape's ndim and dim_size
> from *unsigned int* to signed *int*, because we have to use -1 to represent
> unknown shape information, and reserve 0 for scalar and zero-size tensors.
> One example of such changes can be seen in the cpp-package
> <
> https://github.com/apache/incubator-mxnet/pull/14661/files#diff-c0e1fcfe1619faa4ff5f59d94e8bR183
> >
> calling MXSymbolInferShape.
> - MXSymbolInferShape
> - MXSymbolInferShapePartial
> - MXExecutorSimpleBind
> - MXExecutorReshape
> - MXNDArrayGetShape
> - MXNDArrayCreateFromSharedMem
>
> 2. If you have implemented operators in your own codebases, you will
> probably need to change every operator's shape inference function to use
> the following util functions to check whether shape information is known,
> instead of checking against 0 directly. One example of such changes can be
> seen in the shape inference function
> <
> https://github.com/apache/incubator-mxnet/pull/14661/files#diff-afa640c4653c59f00f43a84455f91ef9R35
> >
> of concat operator.
> - shape_is_known (include/mxnet/tuple.h)
> - ndim_is_known (include/mxnet/tuple.h)
> - dim_size_is_known (include/mxnet/tuple.h)
>
> If you are interested in knowing the value of scalar tensors, and hence
> understanding our motivation further, this thread
>  of
> discussion provides very good insights from the view of data science. It
> was actually related to an opportunity for MXNet becoming the backend of
> PyMC , but somehow it didn't go
> through due to missing several key features
> , and
> scalar tensors is one of them.
>
> Please leave comments in the PR
>  if you have any
> concerns or suggestions of our work.
>
> Thank you very much for your time and consideration.
>
> Best,
> Jun
>
> *References*
> [1] RFC of NumPy compatibility:
> https://github.com/apache/incubator-mxnet/issues/14253
> [2] Pull request of supporting scalar and zero-size tensors:
> https://github.com/apache/incubator-mxnet/pull/14661
> [3] The value of scalar tensors from the view of data science:
> https://discuss.mxnet.io/t/rank-0-arrays-in-mxnet-aka-pi-is-wrong/108
> [4] Previous discussion for MXNet becoming the backend of PyMC:
> https://discuss.mxnet.io/t/moving-pymc3-from-theano-to-mxnet/86
>
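
To make point 2 of the announcement above concrete, here is a minimal sketch of
the new-style check, using the tuple.h helpers the announcement lists; the
wrapper function AllDimsKnown is made up for illustration:

    #include <mxnet/tuple.h>  // shape_is_known, ndim_is_known, dim_size_is_known

    // Old convention: "unknown" was encoded as 0, so code checked ndim() == 0.
    // New convention: -1 encodes unknown; ndim == 0 now means a scalar tensor
    // and dim_size == 0 a zero-size tensor, so direct checks against 0 break.
    bool AllDimsKnown(const mxnet::TShape& shape) {
      if (!mxnet::ndim_is_known(shape)) return false;           // ndim == -1
      for (int i = 0; i < shape.ndim(); ++i) {
        if (!mxnet::dim_size_is_known(shape, i)) return false;  // dim size == -1
      }
      return true;  // same verdict as mxnet::shape_is_known(shape)
    }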


Re: [MXNET 2.0 Wishlist] [DISCUSS] Refine the InferStorageType and memory planning pass

2019-04-10 Thread Sheng Zha
Relay is NNVM v2. The main difference between NNVM and Relay is that the latter 
can represent control flow graphs. Translating the suggested optimization pass 
in this thread from NNVM to Relay should be straightforward. Given that, I'd 
also suggest starting early with NNVM.

-sz

> On Apr 10, 2019, at 8:26 AM, Lv, Tao A  wrote:
> 
> 
> @Tianqi,
> 
> Thank you for the information. I will take a look at that to see if we can 
> take advantage of it.
> 
> @Junru,
> 
> The reason why we want to hold this change until 2.0 is that we know there 
> is a discussion in the TVM community that NNVM will be deprecated soon, and 
> then I think MXNet will have to move to a new IR, either NNVM v2 or Relay. As 
> most changes in this proposal are related to IR passes, we definitely don't 
> want to spend much effort on something that is being deprecated. 2.0 seems to 
> be a more appropriate time for us to make these changes. But I agree with 
> you, we can start to do some experiments on the existing architecture and NNVM IR.
> 
> -tao
> 
> -Original Message-
> From: Junru Shao [mailto:junrushao1...@gmail.com] 
> Sent: Wednesday, April 10, 2019 1:34 PM
> To: dev@mxnet.incubator.apache.org
> Subject: Re: [MXNET 2.0 Wishlist] [DISCUSS] Refine the InferStorageType and 
> memory planning pass
> 
> Agreed with Tianqi that we could have better implementation once we have 
> better tvm nnvm v2 integration. For now I believe that we shouldn't block the 
> development of Intel folks.
> 
> On Tue, Apr 9, 2019 at 10:10 PM Tianqi Chen 
> wrote:
> 
>> Such kind of conversion can be viewed as an enhanced version of 
>> AlterOpLayout in the TVM relay Pass
>> 
>>> On Tue, Apr 9, 2019 at 8:03 PM Lv, Tao A  wrote:
>>> 
>>> 
>>> Thank you Tianqi and Sam for the kind suggestions.
>>> 
>>> @Tianqi,
>>> 
>>> Can you please point me to the code of this pass or do you think 
>>> anyone from TVM community can help to educate me on this? I'm very 
>>> happy to
>> learn
>>> from that.
>>> 
>>> Just one note, we are not only doing layout transformation but also 
>>> need extra memory for it.
>>> For example, (N=32, C=3, H=256, W=256) will be padded to (N=32, 
>>> C=16, H=256, W=256) on channel dimension then convert (N=32, C=16, 
>>> H=256,
>> W=256)
>>> to nchw16c so we can leverage corresponding optimal computation kernels.
>>> That's why we also need changes to the memory planning pass.
>>> 
>>> 
>>> @Sam,
>>> 
>>> Yes, definitely we're treating MKL-DNN as an accelerator on CPU.
>>> Previously we used it to accelerate certain critical operators in 
>>> MXNet
>> in
>>> certain situations, eg. FP32 
>>> convolution/deconvolution/fullyConnected,
>> etc.
>>> But along with the evolution of both MXNet and MKL-DNN, we started to 
>>> do more that might not be supported by MXNet's original CPU 
>>> implementation, such as quantization and graph fusion. So MKL-DNN 
>>> backend is also
>> changing
>>> from a simple `accelerator` to a `default` backend on CPU. And I 
>>> totally agree with you that we need to think more about the software 
>>> architecture
>> for
>>> maintainability, testability and readability - that's why I sent out 
>>> this proposal to get more ideas from the community.
>>> 
>>> 
>>> -tao
>>> 
>>> -Original Message-
>>> From: Skalicky, Sam [mailto:sska...@amazon.com.INVALID]
>>> Sent: Wednesday, April 10, 2019 2:24 AM
>>> To: dev@mxnet.incubator.apache.org
>>> Subject: Re: [MXNET 2.0 Wishlist] [DISCUSS] Refine the 
>>> InferStorageType and memory planning pass
>>> 
>>> I agree with Tianqi. We should let MKLDNN participate in memory 
>>> planning by first having a separate NNVM pass and then using that 
>>> info in the regular memory planning phase.
>>> 
>>> It's starting to sound like MKLDNN should be treated like an 
>>> accelerator rather than an operator library. As it has explicit 
>>> needs and can provide acceleration when given extra capabilities in 
>>> MXNet like having input to the memory planning NNVM pass. It also 
>>> has special tensor formatting
>> needs
>>> and conversions that could be best architected in another way than 
>>> they currently are.
>>> 
>>> We need to think about how we want to architect this for 
>>> maintainability, testability, and readability.
>>>

RE: [MXNET 2.0 Wishlist] [DISCUSS] Refine the InferStorageType and memory planning pass

2019-04-10 Thread Lv, Tao A

@Tianqi,

Thank you for the information. I will take a look at that to see if we can take 
advantage of it.

@Junru,

The reason why we want to hold this change until 2.0 is that we know there is 
a discussion in the TVM community that NNVM will be deprecated soon, and then I 
think MXNet will have to move to a new IR, either NNVM v2 or Relay. As most 
changes in this proposal are related to IR passes, we definitely don't want to 
spend much effort on something that is being deprecated. 2.0 seems to be a more 
appropriate time for us to make these changes. But I agree with you, we can 
start to do some experiments on the existing architecture and NNVM IR.

-tao

-Original Message-
From: Junru Shao [mailto:junrushao1...@gmail.com] 
Sent: Wednesday, April 10, 2019 1:34 PM
To: dev@mxnet.incubator.apache.org
Subject: Re: [MXNET 2.0 Wishlist] [DISCUSS] Refine the InferStorageType and 
memory planning pass

Agreed with Tianqi that we could have better implementation once we have better 
tvm nnvm v2 integration. For now I believe that we shouldn't block the 
development of Intel folks.

On Tue, Apr 9, 2019 at 10:10 PM Tianqi Chen 
wrote:

> Such kind of conversion can be viewed as an enhanced version of 
> AlterOpLayout in the TVM relay Pass
>
> On Tue, Apr 9, 2019 at 8:03 PM Lv, Tao A  wrote:
>
> >
> > Thank you Tianqi and Sam for the kind suggestions.
> >
> > @Tianqi,
> >
> > Can you please point me to the code of this pass or do you think 
> > anyone from TVM community can help to educate me on this? I'm very 
> > happy to
> learn
> > from that.
> >
> > Just one note, we are not only doing layout transformation but also 
> > need extra memory for it.
> > For example, (N=32, C=3, H=256, W=256) will be padded to (N=32, 
> > C=16, H=256, W=256) on channel dimension then convert (N=32, C=16, 
> > H=256,
> W=256)
> > to nchw16c so we can leverage corresponding optimal computation kernels.
> > That's why we also need changes to the memory planning pass.
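
As a concrete illustration of the padding arithmetic quoted above (channel
block size 16, as in nchw16c; the helper name is made up):

    #include <cstdint>

    constexpr int64_t kBlock = 16;

    // Round C up to the next multiple of the block size: 3 -> 16, 17 -> 32.
    inline int64_t PadChannels(int64_t c) {
      return (c + kBlock - 1) / kBlock * kBlock;
    }

    // Extra elements the memory planner would have to reserve for the pad:
    //   N * (PadChannels(C) - C) * H * W
    // For the quoted example: 32 * (16 - 3) * 256 * 256 elements.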
> >
> >
> > @Sam,
> >
> > Yes, definitely we're treating MKL-DNN as an accelerator on CPU.
> > Previously we used it to accelerate certain critical operators in 
> > MXNet
> in
> > certain situations, eg. FP32 
> > convolution/deconvolution/fullyConnected,
> etc.
> > But along with the evolution of both MXNet and MKL-DNN, we started to 
> > do more that might not be supported by MXNet's original CPU 
> > implementation, such as quantization and graph fusion. So MKL-DNN 
> > backend is also
> changing
> > from a simple `accelerator` to a `default` backend on CPU. And I 
> > totally agree with you that we need to think more about the software 
> > architecture
> for
> > maintainability, testability and readability - that's why I sent out 
> > this proposal to get more ideas from the community.
> >
> >
> > -tao
> >
> > -Original Message-
> > From: Skalicky, Sam [mailto:sska...@amazon.com.INVALID]
> > Sent: Wednesday, April 10, 2019 2:24 AM
> > To: dev@mxnet.incubator.apache.org
> > Subject: Re: [MXNET 2.0 Wishlist] [DISCUSS] Refine the 
> > InferStorageType and memory planning pass
> >
> > I agree with Tianqi. We should let MKLDNN participate in memory 
> > planning by first having a separate NNVM pass and then using that 
> > info in the regular memory planning phase.
> >
> > It's starting to sound like MKLDNN should be treated like an 
> > accelerator rather than an operator library. As it has explicit 
> > needs and can provide acceleration when given extra capabilities in 
> > MXNet like having input to the memory planning NNVM pass. It also 
> > has special tensor formatting
> needs
> > and conversions that could be best architected in another way than 
> > they currently are.
> >
> > We need to think about how we want to architect this for 
> > maintainability, testability, and readability.
> >
> > Sam
> >
> >
> > > On Apr 9, 2019, at 11:11 AM, Tianqi Chen 
> > > 
> > wrote:
> > >
> > > The layout transformation should really be a separate optimization 
> > > pass rather than memory planning. As is done in the TVM stack. If 
> > > we want to do a clean slate solution, I would recommend looking 
> > > into that
> > instead.
> > >
> > > TIanqi
> > >
> > > On Tue, Apr 9, 2019 at 1:46 AM Lv, Tao A  wrote:
> > >
> > >>
> > >>
> > >> Hi dev,
> > >>
> > >>
> > >>
> > >> As we're discussing the roadmap for MXNet 2.0, I would 

Re: CUDNN 7.5 Issues

2019-04-09 Thread Per da Silva
Hey Kellen,

I really appreciate that. Thank you!

And thanks to the community for supporting me ^^

Per


On Wed, Apr 10, 2019 at 5:53 AM kellen sunderland <
kellen.sunderl...@gmail.com> wrote:

> Hey Per, just wanted to drop a line and say thanks for supporting the
> community on this one.
>
> On Tue, Apr 9, 2019 at 4:20 AM Per da Silva  wrote:
>
> > I've created an issue to track this problem:
> > https://github.com/apache/incubator-mxnet/issues/14652
> >
> > On Tue, Apr 9, 2019 at 9:07 AM Per da Silva 
> wrote:
> >
> > > Dear MXNet community,
> > >
> > > I've been trying to update the CI GPU images to CUDA 10, but the tests
> > are
> > > failing. I'm not sure why and would really appreciate some help =D
> > >
> > > I've managed, at least, to narrow down the problem to the cuDNN
> version.
> > > The current CUDA 10 image uses cuDNN version 7.5.0.56 (
> > >
> >
> https://gitlab.com/nvidia/cuda/blob/ubuntu16.04/10.0/devel/cudnn7/Dockerfile
> > > ).
> > >
> > > I noticed that the binary in the python packages we release uses cuDNN
> > > 7.3.1.20 (
> > >
> >
> https://github.com/apache/incubator-mxnet/blob/master/tools/setup_gpu_build_tools.sh#L34
> > ),
> > > so I decided to create a PR with CI updated to CUDA 10 with cuDNN
> 7.3.1.20
> > > and sure enough the tests passed (
> > > https://github.com/apache/incubator-mxnet/pull/14513).
> > >
> > > After talking with another contributor, we decided that I would try to
> > > create a PR with CUDA 10 and cuDNN 7.5 and just disable the failing
> tests
> > > (to be fixed later). But, it seems the problem is a bit more heinous. I
> > > disable one test, and another one fails... So, it might make sense to
> > reach
> > > out now and see if we can find the root cause and fix it.
> > >
> > > Some things I've sanity checked:
> > >
> > > We run the tests on g3.8xlarge instances. These instances contain Tesla
> > > M60 GPUs. The Tesla M60s have a compute capability of 5.2. CUDA 10
> > supports
> > > compute capabilities of 3.0 - 7.5 (https://en.wikipedia.org/wiki/CUDA
> ).
> > >
> > > According to the cuDNN support matrix (
> > >
> https://docs.nvidia.com/deeplearning/sdk/cudnn-support-matrix/index.html
> > ),
> > > cuDNN 7.5 is compatible with the GPU, CUDA 10, and requires driver
> > r410.48
> > > (I assume greater or equal).
> > >
> > > The AMIs running on the g3.8xlarge have CUDA 10 and driver 410.73.
> > >
> > > So, as best I can tell, our environment ought to support cuDNN 7.5,
> which
> > > leads me to conclude that maybe there's something wrong in the code.
> > >
> > > The errors are always: "src/operator/./cudnn_rnn-inl.h:759: Check
> failed:
> > > e == CUDNN_STATUS_SUCCESS (6 vs. 0) cuDNN: CUDNN_STATUS_ARCH_MISMATCH".
> > >
> > > According to the cuDNN user guide (
> > >
> >
> https://docs.nvidia.com/deeplearning/sdk/cudnn-developer-guide/index.html
> > > ):
> > >
> > > CUDNN_STATUS_ARCH_MISMATCH
> > >
> > > The function requires a feature absent from the current GPU device.
> Note
> > > that cuDNN only supports devices with compute capabilities greater than
> > or
> > > equal to 3.0.
> > >
> > > To correct: compile and run the application on a device with
> appropriate
> > > compute capability.
> > >
> > > But, as we've seen, our environment seems to support this version of
> > cuDNN
> > > and other versions go through CI w/o any problem...
> > >
> > > You can see some logs here:
> > >
> > >
> >
> http://jenkins.mxnet-ci.amazon-ml.com/blue/organizations/jenkins/mxnet-validation%2Fcentos-gpu/detail/PR-14611/1/pipeline/
> > >
> > >
> > >
> >
> http://jenkins.mxnet-ci.amazon-ml.com/blue/organizations/jenkins/mxnet-validation%2Funix-gpu/detail/PR-14611/12/pipeline/
> > >
> > > I have about 13 runs of this pipeline. The errors for different runs
> can
> > > be seen by changing the number before /pipeline (e.g.
> > >
> >
> http://jenkins.mxnet-ci.amazon-ml.com/blue/organizations/jenkins/mxnet-validation%2Fcentos-gpu/detail/PR-14611/2/pipeline/
> > > <
> >
> http://jenkins.mxnet-ci.amazon-ml.com/blue/organizations/jenkins/mxnet-validation%2Fcentos-gpu/detail/PR-14611/1/pipeline/
> >
> > for
> > > the 2nd run, etc.)
> > >
> > > Thanks in advance for the help!
> > >
> > > You can reach me here or on Slack if you have any questions =D
> > >
> > > Cheers,
> > >
> > > Per
> > >
> > > P.S. I'm attaching some instructions on how to reproduce the issue at
> > home
> > > (or at least on a g3.8xlarge instance running ubuntu 16.04).
> > >
> >
>


Re: [MXNET 2.0 Wishlist] [DISCUSS] Refine the InferStorageType and memory planning pass

2019-04-09 Thread Junru Shao
Agreed with Tianqi that we could have a better implementation once we have
better TVM/NNVM v2 integration. For now I believe that we shouldn't block
the development work from the Intel folks.

On Tue, Apr 9, 2019 at 10:10 PM Tianqi Chen 
wrote:

> Such kind of conversion can be viewed as an enhanced version of
> AlterOpLayout in the TVM relay Pass
>
> On Tue, Apr 9, 2019 at 8:03 PM Lv, Tao A  wrote:
>
> >
> > Thank you Tianqi and Sam for the kind suggestions.
> >
> > @Tianqi,
> >
> > Can you please point me to the code of this pass or do you think anyone
> > from TVM community can help to educate me on this? I'm very happy to
> learn
> > from that.
> >
> > Just one note, we are not only doing layout transformation but also want
> > to have more memory for layout transformation.
> > For example, (N=32, C=3, H=256, W=256) will be padded to (N=32, C=16,
> > H=256, W=256) on channel dimension then convert (N=32, C=16, H=256,
> W=256)
> > to nchw16c so we can leverage corresponding optimal computation kernels.
> > That's why we also need changes to the memory planning pass.
> >
> >
> > @Sam,
> >
> > Yes, definitely we're treating MKL-DNN as an accelerator on CPU.
> > Previously we used it to accelerate certain critical operators in MXNet
> in
> > certain situations, eg. FP32 convolution/deconvolution/fullyConnected,
> etc.
> > But along with the evolution of both MXNet and MKL-DNN, we started to do
> > more which might not be supported by MXNet's original CPU implementation,
> > such as quantization and graph fusion. So the MKL-DNN backend is also
> changing
> > from a simple `accelerator` to a `default` backend on CPU. And I totally
> > agree with you that we need to think more about the software architecture
> for
> > maintainability, testability and readability - that's why I sent out this
> > proposal to get more ideas from the community.
> >
> >
> > -tao
> >
> > -Original Message-
> > From: Skalicky, Sam [mailto:sska...@amazon.com.INVALID]
> > Sent: Wednesday, April 10, 2019 2:24 AM
> > To: dev@mxnet.incubator.apache.org
> > Subject: Re: [MXNET 2.0 Wishlist] [DISCUSS] Refine the InferStorageType
> > and memory planning pass
> >
> > I agree with Tianqi. We should let MKLDNN participate in memory planning
> > by first having a separate NNVM pass and then using that info in the
> > regular memory planning phase.
> >
> > It's starting to sound like MKLDNN should be treated like an accelerator
> > rather than an operator library, as it has explicit needs and can provide
> > acceleration when given extra capabilities in MXNet like having input to
> > the memory planning NNVM pass. It also has special tensor formatting
> needs
> > and conversions that could be best architected in another way than they
> > currently are.
> >
> > We need to think about how we want to architect this for maintainability,
> > testability, and readability.
> >
> > Sam
> >
> >
> > > On Apr 9, 2019, at 11:11 AM, Tianqi Chen 
> > wrote:
> > >
> > > The layout transformation should really be a separate optimization
> > > pass rather than memory planning. As is done in the TVM stack. If we
> > > want to do a clean slate solution, I would recommend looking into that
> > instead.
> > >
> > > Tianqi
> > >
> > > On Tue, Apr 9, 2019 at 1:46 AM Lv, Tao A  wrote:
> > >
> > >>
> > >>
> > >> Hi dev,
> > >>
> > >>
> > >>
> > >> As we're discussing the roadmap for MXNet 2.0, I would like to start
> > >> a thread about refining the InferStorageType and memory planning pass
> > >> in MXNet and hope it can happen as a part of the 2.0 release.
> > >>
> > >>
> > >>
> > >> Thanks to @eric-haibin-lin, part of the proposal has already been
> > >> discussed in issue #13598 [1].
> > >>
> > >>
> > >>
> > >> As mentioned in the description of issue #13598, there are several
> > >> drawbacks of the existing flow. Please allow me to quote them here:
> > >> *the selection of MKL/CPU/GPU/CUDNN implementation happens
> after
> > >> graph attribute inference and memory planning, memory planning is
> > >> thus not aware of the implementation that will be used for execution
> >> in the future, which may result in a sub-optimal result. For example,
> >> the memory inplace option may vary depending on the accelerator
> >> backend (the new version of CUDNN enables x/dx inplace for _backward_conv).

Re: [MXNET 2.0 Wishlist] [DISCUSS] Refine the InferStorageType and memory planning pass

2019-04-09 Thread Tianqi Chen
Such kind of conversion can be viewed as an enhanced version of
AlterOpLayout in the TVM relay Pass

On Tue, Apr 9, 2019 at 8:03 PM Lv, Tao A  wrote:

>
> Thank you Tianqi and Sam for the kind suggestions.
>
> @Tianqi,
>
> Can you please point me to the code of this pass or do you think anyone
> from TVM community can help to educate me on this? I'm very happy to learn
> from that.
>
> Just one note, we are not only doing layout transformation but also want
> to have more memory for layout transformation.
> For example, (N=32, C=3, H=256, W=256) will be padded to (N=32, C=16,
> H=256, W=256) on channel dimension then convert (N=32, C=16, H=256, W=256)
> to nchw16c so we can leverage corresponding optimal computation kernels.
> That's why we also need changes to the memory planning pass.
>
>
> @Sam,
>
> Yes, definitely we're treating MKL-DNN as an accelerator on CPU.
> Previously we used it to accelerate certain critical operators in MXNet in
> certain situations, eg. FP32 convolution/deconvolution/fullyConnected, etc.
> But along with the evolution of both MXNet and MKL-DNN, we started to do
> more which might not be supported by MXNet's original CPU implementation,
> such as quantization and graph fusion. So the MKL-DNN backend is also changing
> from a simple `accelerator` to a `default` backend on CPU. And I totally
> agree with you that we need to think more about the software architecture for
> maintainability, testability and readability - that's why I sent out this
> proposal to get more ideas from the community.
>
>
> -tao
>
> -Original Message-
> From: Skalicky, Sam [mailto:sska...@amazon.com.INVALID]
> Sent: Wednesday, April 10, 2019 2:24 AM
> To: dev@mxnet.incubator.apache.org
> Subject: Re: [MXNET 2.0 Wishlist] [DISCUSS] Refine the InferStorageType
> and memory planning pass
>
> I agree with Tianqi. We should let MKLDNN participate in memory planning
> by first having a separate NNVM pass and then using that info in the
> regular memory planning phase.
>
> It's starting to sound like MKLDNN should be treated like an accelerator
> rather than an operator library, as it has explicit needs and can provide
> acceleration when given extra capabilities in MXNet like having input to
> the memory planning NNVM pass. It also has special tensor formatting needs
> and conversions that could be best architected in another way than they
> currently are.
>
> We need to think about how we want to architect this for maintainability,
> testability, and readability.
>
> Sam
>
>
> > On Apr 9, 2019, at 11:11 AM, Tianqi Chen 
> wrote:
> >
> > The layout transformation should really be a separate optimization
> > pass rather than memory planning. As is done in the TVM stack. If we
> > want to do a clean slate solution, I would recommend looking into that
> instead.
> >
> > Tianqi
> >
> > On Tue, Apr 9, 2019 at 1:46 AM Lv, Tao A  wrote:
> >
> >>
> >>
> >> Hi dev,
> >>
> >>
> >>
> >> As we're discussing the roadmap for MXNet 2.0, I would like to start
> >> a thread about refining the InferStorageType and memory planning pass
> >> in MXNet and hope it can happen as a part of the 2.0 release.
> >>
> >>
> >>
> >> Thanks to @eric-haibin-lin, part of the proposal has already been
> >> discussed in issue #13598 [1].
> >>
> >>
> >>
> >> As mentioned in the description of issue #13598, there are several
> >> drawbacks of the existing flow. Please allow me to quote them here:
> >> *the selection of MKL/CPU/GPU/CUDNN implementation happens after
> >> graph attribute inference and memory planning, memory planning is
> >> thus not aware of the implementation that will be used for execution
> >> in the future, which may result in a sub-optimal result. For example,
> >> the memory inplace option may vary depending on the accelerator
> >> backend (the new version of CUDNN enables x/dx inplace for
> _backward_conv).
> >> *some sparse operators need to access dtype/shape information to
> >> decide which implementation to invoke for execution, and whether to
> >> perform fallback. This information is not yet exposed in the existing
> >> infer storage type interface.
> >>
> >>
> >>
> >> Besides, the existing memory planning pass calculates and afterwards
> >> allocates memory strictly according to the input/output tensor shapes
> >> (which can be got from operators' arithmetic formulas through
> InferShape).
> That's not true anymore when we come to accelerators like MKL-DNN on CPU,
> which want to pad input/output tensors to optimal formats (e.g. nchw16c)
> according to the hardware architecture.

Re: [MXNET 2.0 Wishlist] [DISCUSS] Refine the InferStorageType and memory planning pass

2019-04-09 Thread Junru Shao
+1 for this proposal. Probably this is doable prior to 2.0?

While I totally agree with Tianqi that, in the sense of a compiler, we
should make layout transformation a separate pass, I would like to mention
that it will be a non-trivial engineering effort given that our current
NNVM does not have a pass manager for optionally applying passes.
Moreover, I believe Tao's proposal is roughly equivalent to adding a new
pass in NNVM (but one with the same name).

By the way, treating MKLDNN as an accelerator is a nice proposal, which I
guess could be a wish for MXNet 2.0.

On Tue, Apr 9, 2019 at 8:39 PM Zhao, Patric  wrote:

> BTW, "maintainability, testability and readability" has always been our design
> goal from the starting point of the MKL-DNN integration :)
>
> > -Original Message-
> > From: Lv, Tao A [mailto:tao.a...@intel.com]
> > Sent: Wednesday, April 10, 2019 11:03 AM
> > To: dev@mxnet.incubator.apache.org
> > Subject: RE: [MXNET 2.0 Wishlist] [DISCUSS] Refine the InferStorageType
> and
> > memory planning pass
> >
> >
> > Thank you Tianqi and Sam for the kind suggestions.
> >
> > @Tianqi,
> >
> > Can you please point me to the code of this pass or do you think anyone
> > from TVM community can help to educate me on this? I'm very happy to
> > learn from that.
> >
> > Just one note, we are not only doing layout transformation but also want
> to
> > have more memory for layout transformation.
> > For example, (N=32, C=3, H=256, W=256) will be padded to (N=32, C=16,
> > H=256, W=256) on channel dimension then convert (N=32, C=16, H=256,
> > W=256) to nchw16c so we can leverage corresponding optimal computation
> > kernels.
> > That's why we also need changes to the memory planning pass.
> >
> >
> > @Sam,
> >
> > Yes, definitely we're treating MKL-DNN as an accelerator on CPU.
> Previously
> > we used it to accelerate certain critical operators in MXNet in certain
> > situations, eg. FP32 convolution/deconvolution/fullyConnected, etc. But
> > along with the evolution of both MXNet and MKL-DNN, we started to do more
> > which might not be supported by MXNet's original CPU implementation, such
> > as quantization and graph fusion. So the MKL-DNN backend is also changing
> from
> > a simple `accelerator` to a `default` backend on CPU. And I totally
> agree with
> > you that we need to think more about the software architecture for
> > maintainability, testability and readability - that's why I sent out
> this proposal
> > to get more ideas from the community.
> >
> >
> > -tao
> >
> > -Original Message-
> > From: Skalicky, Sam [mailto:sska...@amazon.com.INVALID]
> > Sent: Wednesday, April 10, 2019 2:24 AM
> > To: dev@mxnet.incubator.apache.org
> > Subject: Re: [MXNET 2.0 Wishlist] [DISCUSS] Refine the InferStorageType
> and
> > memory planning pass
> >
> > I agree with Tianqi. We should let MKLDNN participate in memory planning
> > by first having a separate NNVM pass and then using that info in the
> regular
> > memory planning phase.
> >
> > It's starting to sound like MKLDNN should be treated like an accelerator
> rather
> > than an operator library, as it has explicit needs and can provide
> acceleration
> > when given extra capabilities in MXNet like having input to the memory
> > planning NNVM pass. It also has special tensor formatting needs and
> > conversions that could be best architected in another way than they
> > currently are.
> >
> > We need to think about how we want to architect this for maintainability,
> > testability, and readability.
> >
> > Sam
> >
> >
> > > On Apr 9, 2019, at 11:11 AM, Tianqi Chen 
> > wrote:
> > >
> > > The layout transformation should really be a separate optimization
> > > pass rather than memory planning. As is done in the TVM stack. If we
> > > want to do a clean slate solution, I would recommend looking into that
> > instead.
> > >
> > > Tianqi
> > >
> > > On Tue, Apr 9, 2019 at 1:46 AM Lv, Tao A  wrote:
> > >
> > >>
> > >>
> > >> Hi dev,
> > >>
> > >>
> > >>
> > >> As we're discussing the roadmap for MXNet 2.0, I would like to start
> > >> a thread about refining the InferStorageType and memory planning pass
> > >> in MXNet and hope it can happen as a part of the 2.0 release.
> > >>
> > >>
> > >>
> > >> Thanks to @eric-haibin-lin, part of the proposal has already been
> > >> discussed in issue #13598 [1].

Re: CUDNN 7.5 Issues

2019-04-09 Thread kellen sunderland
Hey Per, just wanted to drop a line and say thanks for supporting the
community on this one.

On Tue, Apr 9, 2019 at 4:20 AM Per da Silva  wrote:

> I've created an issue to track this problem:
> https://github.com/apache/incubator-mxnet/issues/14652
>
> On Tue, Apr 9, 2019 at 9:07 AM Per da Silva  wrote:
>
> > Dear MXNet community,
> >
> > I've been trying to update the CI GPU images to CUDA 10, but the tests
> are
> > failing. I'm not sure why and would really appreciate some help =D
> >
> > I've managed, at least, to narrow down the problem to the cuDNN version.
> > The current CUDA 10 image uses cuDNN version 7.5.0.56 (
> >
> https://gitlab.com/nvidia/cuda/blob/ubuntu16.04/10.0/devel/cudnn7/Dockerfile
> > ).
> >
> > I noticed that the binary in the python packages we release uses cuDNN
> > 7.3.1.20 (
> >
> https://github.com/apache/incubator-mxnet/blob/master/tools/setup_gpu_build_tools.sh#L34
> ),
> > so decided to create a PR with CI updated to CUDA 10 with cuDNN 7.3.1.20
> > and sure enough the tests passed (
> > https://github.com/apache/incubator-mxnet/pull/14513).
> >
> > After talking with another contributor, we decided that I would try to
> > create a PR with CUDA 10 and cuDNN 7.5 and just disable the failing tests
> > (to be fixed later). But, it seems the problem is a bit more heinous. I
> > disable one test, and another one fails...So, it might make sense to
> reach
> > out now and see if we can find the root cause and fix it.
> >
> > Some things I've sanity checked:
> >
> > We run the tests on g3.8xlarge instances. These instances contain Tesla
> > M60 GPUs. The Tesla M60s have a compute capability of 5.2. CUDA 10
> supports
> > compute capabilities of 3.0 - 7.5 (https://en.wikipedia.org/wiki/CUDA).
> >
> > According to the cuDNN support matrix (
> > https://docs.nvidia.com/deeplearning/sdk/cudnn-support-matrix/index.html
> ),
> > cuDNN 7.5 is compatible with the GPU, CUDA 10, and requires driver
> r410.48
> > (I assume greater or equal).
> >
> > The AMIs running on the g3.8xlarge have CUDA 10 and driver 410.73.
> >
> > So, as best I can tell, our environment ought to support cuDNN 7.5, which
> > leads me to conclude that maybe there's something wrong in the code.
> >
> > The errors are always: "src/operator/./cudnn_rnn-inl.h:759: Check failed:
> > e == CUDNN_STATUS_SUCCESS (6 vs. 0) cuDNN: CUDNN_STATUS_ARCH_MISMATCH".
> >
> > According to the cuDNN user guide (
> >
> https://docs.nvidia.com/deeplearning/sdk/cudnn-developer-guide/index.html
> > ):
> >
> > CUDNN_STATUS_ARCH_MISMATCH
> >
> > The function requires a feature absent from the current GPU device. Note
> > that cuDNN only supports devices with compute capabilities greater than
> or
> > equal to 3.0.
> >
> > To correct: compile and run the application on a device with appropriate
> > compute capability.
> >
> > But, as we've seen, our environment seems to support this version of
> cuDNN
> > and other versions go through CI w/o any problem...
> >
> > You can see some logs here:
> >
> >
> http://jenkins.mxnet-ci.amazon-ml.com/blue/organizations/jenkins/mxnet-validation%2Fcentos-gpu/detail/PR-14611/1/pipeline/
> >
> >
> >
> http://jenkins.mxnet-ci.amazon-ml.com/blue/organizations/jenkins/mxnet-validation%2Funix-gpu/detail/PR-14611/12/pipeline/
> >
> > I have about 13 runs of this pipeline. The errors for different runs can
> > be seen by changing the number before /pipeline (e.g.
> >
> http://jenkins.mxnet-ci.amazon-ml.com/blue/organizations/jenkins/mxnet-validation%2Fcentos-gpu/detail/PR-14611/2/pipeline/
> > <
> http://jenkins.mxnet-ci.amazon-ml.com/blue/organizations/jenkins/mxnet-validation%2Fcentos-gpu/detail/PR-14611/1/pipeline/>
> for
> > the 2nd run, etc.)
> >
> > Thanks in advance for the help!
> >
> > You can reach me here or on Slack if you have any questions =D
> >
> > Cheers,
> >
> > Per
> >
> > P.S. I'm attaching some instructions on how to reproduce the issue at
> home
> > (or at least on a g3.8xlarge instance running ubuntu 16.04).
> >
>


RE: [MXNET 2.0 Wishlist] [DISCUSS] Refine the InferStorageType and memory planning pass

2019-04-09 Thread Zhao, Patric
BTW, "maintainability, testability and readability" has always been our design
goal from the starting point of the MKL-DNN integration :)

> -Original Message-
> From: Lv, Tao A [mailto:tao.a...@intel.com]
> Sent: Wednesday, April 10, 2019 11:03 AM
> To: dev@mxnet.incubator.apache.org
> Subject: RE: [MXNET 2.0 Wishlist] [DISCUSS] Refine the InferStorageType and
> memory planning pass
> 
> 
> Thank you Tianqi and Sam for the kind suggestions.
> 
> @Tianqi,
> 
> Can you please point me to the code of this pass or do you think anyone
> from TVM community can help to educate me on this? I'm very happy to
> learn from that.
> 
> Just one note, we are not only doing layout transformation but also want to
> have more memory for layout transformation.
> For example, (N=32, C=3, H=256, W=256) will be padded to (N=32, C=16,
> H=256, W=256) on channel dimension then convert (N=32, C=16, H=256,
> W=256) to nchw16c so we can leverage corresponding optimal computation
> kernels.
> That's why we also need changes to the memory planning pass.
> 
> 
> @Sam,
> 
> Yes, definitely we're treating MKL-DNN as an accelerator on CPU. Previously
> we used it to accelerate certain critical operators in MXNet in certain
> situations, eg. FP32 convolution/deconvolution/fullyConnected, etc. But
> along with the evolution of both MXNet and MKL-DNN, we started to do more
> which might not be supported by MXNet's original CPU implementation, such
> as quantization and graph fusion. So the MKL-DNN backend is also changing from
> a simple `accelerator` to a `default` backend on CPU. And I totally agree with
> you that we need to think more about the software architecture for
> maintainability, testability and readability - that's why I sent out this 
> proposal
> to get more ideas from the community.
> 
> 
> -tao
> 
> -Original Message-
> From: Skalicky, Sam [mailto:sska...@amazon.com.INVALID]
> Sent: Wednesday, April 10, 2019 2:24 AM
> To: dev@mxnet.incubator.apache.org
> Subject: Re: [MXNET 2.0 Wishlist] [DISCUSS] Refine the InferStorageType and
> memory planning pass
> 
> I agree with Tianqi. We should let MKLDNN participate in memory planning
> by first having a separate NNVM pass and then using that info in the regular
> memory planning phase.
> 
> It's starting to sound like MKLDNN should be treated like an accelerator
> rather than an operator library, as it has explicit needs and can provide
> acceleration
> when given extra capabilities in MXNet like having input to the memory
> planning NNVM pass. It also has special tensor formatting needs and
> conversions that could be best architected in another way than they
> currently are.
> 
> We need to think about how we want to architect this for maintainability,
> testability, and readability.
> 
> Sam
> 
> 
> > On Apr 9, 2019, at 11:11 AM, Tianqi Chen 
> wrote:
> >
> > The layout transformation should really be a separate optimization
> > pass rather than memory planning. As is done in the TVM stack. If we
> > want to do a clean slate solution, I would recommend looking into that
> instead.
> >
> > Tianqi
> >
> > On Tue, Apr 9, 2019 at 1:46 AM Lv, Tao A  wrote:
> >
> >>
> >>
> >> Hi dev,
> >>
> >>
> >>
> >> As we're discussing the roadmap for MXNet 2.0, I would like to start
> >> a thread about refining the InferStorageType and memory planning pass
> >> in MXNet and hope it can happen as a part of the 2.0 release.
> >>
> >>
> >>
> >> Thanks to @eric-haibin-lin, part of the proposal has already been
> >> discussed in issue #13598 [1].
> >>
> >>
> >>
> >> As mentioned in the description of issue #13598, there are several
> >> drawbacks of the existing flow. Please allow me to quote them here:
> >> *the selection of MKL/CPU/GPU/CUDNN implementation happens
> after
> >> graph attribute inference and memory planning, memory planning is
> >> thus not aware of the implementation that will be used for execution
> >> in the future, which may result in a sub-optimal result. For example,
> >> the memory inplace option may vary depending on the accelerator
> >> backend (the new version of CUDNN enables x/dx inplace for
> _backward_conv).
> >> *some sparse operators need to access dtype/shape information to
> >> decide which implementation to invoke for execution, and whether to
> >> perform fallback. This information is not yet exposed in the existing
> >> infer storage type interface.
> >>
> >>
> >>
> >> Besides, the existing memory planning pass calculates and afterwards
> >> allocates memory strictly according to the input/output tensor shapes
> >> (which can be got from operators' arithmetic formulas through InferShape).

RE: [MXNET 2.0 Wishlist] [DISCUSS] Refine the InferStorageType and memory planning pass

2019-04-09 Thread Lv, Tao A


Thank you Tianqi and Sam for the kind suggestions.

@Tianqi,

Can you please point me to the code of this pass or do you think anyone from 
TVM community can help to educate me on this? I'm very happy to learn from that.

Just one note: we are not only doing layout transformation but also want to
allocate more memory for the layout transformation.
For example, (N=32, C=3, H=256, W=256) will be padded to (N=32, C=16, H=256,
W=256) on the channel dimension, and then (N=32, C=16, H=256, W=256) is
converted to nchw16c so we can leverage the corresponding optimal computation
kernels. That's why we also need changes to the memory planning pass.
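To make that concrete, here is a minimal NumPy sketch of the padding-plus-blocking
idea (an illustration only - MKL-DNN performs this reordering internally, and the
vector length of 16 assumes fp32 lanes on AVX-512):

import numpy as np

VLEN = 16  # assumed vector length: fp32 lanes on AVX-512
x = np.random.rand(32, 3, 256, 256).astype(np.float32)      # N, C, H, W

pad_c = (-x.shape[1]) % VLEN                                 # C=3 -> pad 13 channels
x_pad = np.pad(x, ((0, 0), (0, pad_c), (0, 0), (0, 0)),
               mode="constant")                              # (32, 16, 256, 256)

# Block the channel dimension: NCHW -> nchw16c, i.e. (N, C/16, H, W, 16)
n, c, h, w = x_pad.shape
x_blocked = x_pad.reshape(n, c // VLEN, VLEN, h, w).transpose(0, 1, 3, 4, 2)
print(x_blocked.shape)                                       # (32, 1, 256, 256, 16)

Note that the padded buffer holds 16/3 ~ 5.3x the channel data of the arithmetic
shape, which is exactly why the memory planner has to know about it up front.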


@Sam,

Yes, definitely we're treating MKL-DNN as an accelerator on CPU. Previously we 
used it to accelerate certain critical operators in MXNet in certain 
situations, eg. FP32 convolution/deconvolution/fullyConnected, etc. But along 
with the evolution of both MXNet and MKL-DNN, we started to do more which might
not be supported by MXNet's original CPU implementation, such as quantization and
graph fusion. So the MKL-DNN backend is also changing from a simple `accelerator`
to a `default` backend on CPU. And I totally agree with you that we need to think
more about the software architecture for maintainability, testability and
readability - that's why I sent out this proposal to get more ideas from the 
community.


-tao

-Original Message-
From: Skalicky, Sam [mailto:sska...@amazon.com.INVALID] 
Sent: Wednesday, April 10, 2019 2:24 AM
To: dev@mxnet.incubator.apache.org
Subject: Re: [MXNET 2.0 Wishlist] [DISCUSS] Refine the InferStorageType and 
memory planning pass

I agree with Tianqi. We should let MKLDNN participate in memory planning by
first having a separate NNVM pass and then using that info in the regular 
memory planning phase.

It's starting to sound like MKLDNN should be treated like an accelerator rather
than an operator library, as it has explicit needs and can provide acceleration
when given extra capabilities in MXNet like having input to the memory planning 
NNVM pass. It also has special tensor formatting needs and conversions that 
could be best architected in another way than they currently are.

We need to think about how we want to architect this for maintainability, 
testability, and readability.

Sam


> On Apr 9, 2019, at 11:11 AM, Tianqi Chen  wrote:
> 
> The layout transformation should really be a separate optimization 
> pass rather than memory planning. As is done in the TVM stack. If we 
> want to do a clean slate solution, I would recommend looking into that 
> instead.
> 
> Tianqi
> 
> On Tue, Apr 9, 2019 at 1:46 AM Lv, Tao A  wrote:
> 
>> 
>> 
>> Hi dev,
>> 
>> 
>> 
>> As we're discussing the roadmap for MXNet 2.0, I would like to start 
>> a thread about refining the InferStorageType and memory planning pass 
>> in MXNet and hope it can happen as a part of the 2.0 release.
>> 
>> 
>> 
>> Thanks to @eric-haibin-lin, part of the proposal has already been 
>> discussed in issue #13598 [1].
>> 
>> 
>> 
>> As mentioned in the description of issue #13598, there are several 
>> drawbacks of the existing flow. Please allow me to quote them here:
>> *the selection of MKL/CPU/GPU/CUDNN implementation happens after
>> graph attribute inference and memory planning, memory planning is 
>> thus not aware of the implementation that will be used for execution 
>> in the future, which may result in a sub-optimal result. For example,
>> the memory inplace option may vary depending on the accelerator 
>> backend (the new version of CUDNN enables x/dx inplace for _backward_conv).
>> *some sparse operators need to access dtype/shape information to
>> decide which implementation to invoke for execution, and whether to 
>> perform fallback. This information is not yet exposed in the existing 
>> infer storage type interface.
>> 
>> 
>> 
>> Besides, the existing memory planning pass calculates and afterwards 
>> allocates memory strictly according to the input/output tensor shapes 
>> (which can be got from operators' arithmetic formulas through InferShape).
>> That's not true anymore when we come to accelerators like MKL-DNN on 
>> CPU which wants to pad input/output tensor to optimal formats (eg. 
>> nchw16c) according to hardware architecture. It also can be described 
>> as shape + stride. As many of you know, MKL-DNN shows great 
>> performance on these optimal formats which is blocked by the vector length 
>> of AVX512 or AVX2.
>> It's very natural for us to pad on the channel dimension for those 
>> inputs/outputs which IC or OC is not multiples of vector length and 
>> leverage optimal kernels for blocked formats. Unfortunately this 
>> cannot be implemented without changing the logic in the memory planning pass.

Re: [MXNET 2.0 Wishlist] [DISCUSS] Refine the InferStorageType and memory planning pass

2019-04-09 Thread Skalicky, Sam
I agree with Tianqi. We should let MKLDNN participate in memory planning by
first having a separate NNVM pass and then using that info in the regular 
memory planning phase.

It's starting to sound like MKLDNN should be treated like an accelerator rather
than an operator library, as it has explicit needs and can provide acceleration
when given extra capabilities in MXNet like having input to the memory planning 
NNVM pass. It also has special tensor formatting needs and conversions that 
could be best architected in another way than they currently are.

We need to think about how we want to architect this for maintainability, 
testability, and readability.

Sam


> On Apr 9, 2019, at 11:11 AM, Tianqi Chen  wrote:
> 
> The layout transformation should really be a separate optimization pass
> rather than memory planning. As is done in the TVM stack. If we want to do
> a clean slate solution, I would recommend looking into that instead.
> 
> Tianqi
> 
> On Tue, Apr 9, 2019 at 1:46 AM Lv, Tao A  wrote:
> 
>> 
>> 
>> Hi dev,
>> 
>> 
>> 
>> As we're discussing the roadmap for MXNet 2.0, I would like to start a
>> thread about refining the InferStorageType and memory planning pass in
>> MXNet and hope it can happen as a part of the 2.0 release.
>> 
>> 
>> 
>> Thanks to @eric-haibin-lin, part of the proposal has already been
>> discussed in issue #13598 [1].
>> 
>> 
>> 
>> As mentioned in the description of issue #13598, there are several
>> drawbacks of the existing flow. Please allow me to quote them here:
>> *the selection of MKL/CPU/GPU/CUDNN implementation happens after
>> graph attribute inference and memory planning, memory planning is thus not
>> aware of the implementation that will be used for execution in the future,
>> which may result in a sub-optimal result. For example, the memory inplace
>> option may vary depending on the accelerator backend (the new version of
>> CUDNN enables x/dx inplace for _backward_conv).
>> *some sparse operators need to access dtype/shape information to
>> decide which implementation to invoke for execution, and whether to perform
>> fallback. This information is not yet exposed in the existing infer storage
>> type interface.
>> 
>> 
>> 
>> Besides, the existing memory planning pass calculates and afterwards
>> allocates memory strictly according to the input/output tensor shapes
>> (which can be got from operators' arithmetic formulas through InferShape).
>> That's not true anymore when we come to accelerators like MKL-DNN on CPU
>> which wants to pad input/output tensor to optimal formats (eg. nchw16c)
>> according to hardware architecture. It also can be described as shape +
>> stride. As many of you know, MKL-DNN shows great performance on these
>> optimal formats which is blocked by the vector length of AVX512 or AVX2.
>> It's very natural for us to pad on the channel dimension for those
>> inputs/outputs which IC or OC is not multiples of vector length and
>> leverage optimal kernels for blocked formats. Unfortunately this cannot be
>> implemented without changing the logic in the memory planning pass.
>> Currently we always fallback to slow reference kernels for both convolution
>> [1] and deconvolution [2].
>> 
>> 
>> 
>> AFAIK, the padding feature of MKL-DNN has already been used in TensorFlow
>> and other frameworks. We also found that, without supporting this feature,
>> many other new features from MKL-DNN cannot be applied to MXNet,  such as
>> the deconvolution primitive, winograd, etc.
>> 
>> 
>> 
>> Changes for this proposal can be divided into the following parts:
>> 1.  Following the proposal in issue #13598, we need to add new
>> InferStorageTypeEx functions to operators which need to do dispatch in a
>> more fine-grained way. This also needs the InferStorage pass to handle the
>> new -Ex functions as we did for FCompute and FComputeEx.
>> 2.  Attach more information to the computation graph/node, eg.
>> accelerator specific information. Currently we add `IsMKLDNN` directly
>> during operator registration if MXNET_USE_MKLDNN == 1. It looks simple and
>> crude to me.
>> 3.  Do memory planning according to more information: topology,
>> shapes, data types, in-place options and more accurate accelerator
>> information (accelerator path, memory size requirements, accelerator-wise
>> attributes).
>> 4.  Improve MKL-DNN operators so they can work on those well planned
>> memory which may be larger than the arithmetic requirements and work with
>> optimal kernels. Also, with more accurate dispatching in
>> InferStorageTypeEx, there is no need for us to write complicated fallback
>> logic in MKL-DNN operators.
>> 5.  If users feel uncomfortable with more memory usage, we can disable
>> this feature via environment variables.
>> 
>> 
>> 
>> Since the memory planning pass is implemented in NNVM, we also need
>> support from the TVM community.
>> 
>> 
>> 
>> Please let me know what do you think. Thank you.
>> 
>> 
>> 
>> -tao
>> 

Re: [MXNET 2.0 Wishlist] [DISCUSS] Refine the InferStorageType and memory planning pass

2019-04-09 Thread Tianqi Chen
The layout transformation should really be a separate optimization pass
rather than memory planning. As is done in the TVM stack. If we want to do
a clean slate solution, I would recommend looking into that instead.

Tianqi

On Tue, Apr 9, 2019 at 1:46 AM Lv, Tao A  wrote:

>
>
> Hi dev,
>
>
>
> As we're discussing the roadmap for MXNet 2.0, I would like to start a
> thread about refining the InferStorageType and memory planning pass in
> MXNet and hope it can happen as a part of the 2.0 release.
>
>
>
> Thanks to @eric-haibin-lin, part of the proposal has already been
> discussed in issue #13598 [1].
>
>
>
> As mentioned in the description of issue #13598, there are several
> drawbacks of the existing flow. Please allow me to quote them here:
> *the selection of MKL/CPU/GPU/CUDNN implementation happens after
> graph attribute inference and memory planning, memory planning is thus not
> aware of the implementation that will be used for execution in the future,
> which may result in a sub-optimal result. For example, the memory inplace
> option may vary depending on the accelerator backend (the new version of
> CUDNN enables x/dx inplace for _backward_conv).
> *some sparse operators need to access dtype/shape information to
> decide which implementation to invoke for execution, and whether to perform
> fallback. This information is not yet exposed in the existing infer storage
> type interface.
>
>
>
> Besides, the existing memory planning pass calculates and afterwards
> allocates memory strictly according to the input/output tensor shapes
> (which can be got from operators' arithmetic formulas through InferShape).
> That's not true anymore when we come to accelerators like MKL-DNN on CPU
> which wants to pad input/output tensor to optimal formats (eg. nchw16c)
> according to hardware architecture. It also can be described as shape +
> stride. As many of you know, MKL-DNN shows great performance on these
> optimal formats which is blocked by the vector length of AVX512 or AVX2.
> It's very natural for us to pad on the channel dimension for those
> inputs/outputs which IC or OC is not multiples of vector length and
> leverage optimal kernels for blocked formats. Unfortunately this cannot be
> implemented without changing the logic in the memory planning pass.
> Currently we always fallback to slow reference kernels for both convolution
> [1] and deconvolution [2].
>
>
>
> AFAIK, the padding feature of MKL-DNN has already been used in TensorFlow
> and other frameworks. We also found that, without supporting this feature,
> many other new features from MKL-DNN cannot be applied to MXNet,  such as
> the deconvolution primitive, winograd, etc.
>
>
>
> Changes for this proposal can be divided into the following parts:
> 1.  Following the proposal in issue #13598, we need to add new
> InferStorageTypeEx functions to operators which need to do dispatch in a
> more fine-grained way. This also needs the InferStorage pass to handle the
> new -Ex functions as we did for FCompute and FComputeEx.
> 2.  Attach more information to the computation graph/node, eg.
> accelerator specific information. Currently we add `IsMKLDNN` directly
> during operator registration if MXNET_USE_MKLDNN == 1. It looks simple and
> crude to me.
> 3.  Do memory planning according to more information: topology,
> shapes, data types, in-place options and more accurate accelerator
> information (accelerator path, memory size requirements, accelerator-wise
> attributes).
> 4.  Improve MKL-DNN operators so they can work on those well planned
> memory which may be larger than the arithmetic requirements and work with
> optimal kernels. Also, with more accurate dispatching in
> InferStorageTypeEx, there is no need for us to write complicated fallback
> logic in MKL-DNN operators.
> 5.  If users feel uncomfortable with more memory usage, we can disable
> this feature via environment variables.
>
>
>
> Since the memory planning pass is implemented in NNVM, we also need
> support from the TVM community.
>
>
>
> Please let me know what do you think. Thank you.
>
>
>
> -tao
>
>
>
> [1] https://github.com/apache/incubator-mxnet/issues/13598
>
> [2]
> https://github.com/apache/incubator-mxnet/blob/master/src/operator/nn/mkldnn/mkldnn_convolution.cc#L194
>
> [3]
> https://github.com/apache/incubator-mxnet/blob/master/src/operator/nn/mkldnn/mkldnn_deconvolution.cc#L55
>
>
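A tiny sketch of what the opt-out in item (5) of the quoted proposal could look
like from the user's side (the variable name MXNET_MKLDNN_ENABLE_PADDING is
hypothetical, not an existing MXNet flag):

import os

def padding_enabled():
    # Hypothetical flag; default on, set to 0 to keep exact-shape planning.
    return os.environ.get("MXNET_MKLDNN_ENABLE_PADDING", "1") == "1"

def planned_channels(c, vlen=16):
    """Channels the memory planner would reserve for a tensor with c channels."""
    return -(-c // vlen) * vlen if padding_enabled() else c

print(planned_channels(3))  # 16 with padding enabled, 3 when disabled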


Re: CUDNN 7.5 Issues

2019-04-09 Thread Per da Silva
I've created an issue to track this problem:
https://github.com/apache/incubator-mxnet/issues/14652

On Tue, Apr 9, 2019 at 9:07 AM Per da Silva  wrote:

> Dear MXNet community,
>
> I've been trying to update the CI GPU images to CUDA 10, but the tests are
> failing. I'm not sure why and would really appreciate some help =D
>
> I've managed, at least, to narrow down the problem to the cuDNN version.
> The current CUDA 10 image uses cuDNN version 7.5.0.56 (
> https://gitlab.com/nvidia/cuda/blob/ubuntu16.04/10.0/devel/cudnn7/Dockerfile
> ).
>
> I noticed that the binary in the python packages we release uses cuDNN
> 7.3.1.20 (
> https://github.com/apache/incubator-mxnet/blob/master/tools/setup_gpu_build_tools.sh#L34),
> so decided to create a PR with CI updated to CUDA 10 with cuDNN 7.3.1.20
> and sure enough the tests passed (
> https://github.com/apache/incubator-mxnet/pull/14513).
>
> After talking with another contributor, we decided that I would try to
> create a PR with CUDA 10 and cuDNN 7.5 and just disable the failing tests
> (to be fixed later). But, it seems the problem is a bit more heinous. I
> disable one test, and another one fails...So, it might make sense to reach
> out now and see if we can find the root cause and fix it.
>
> Some things I've sanity checked:
>
> We run the tests on g3.8xlarge instances. These instances contain Tesla
> M60 GPUs. The Tesla M60s have a compute capability of 5.2. CUDA 10 supports
> compute capabilities of 3.0 - 7.5 (https://en.wikipedia.org/wiki/CUDA).
>
> According to the cuDNN support matrix (
> https://docs.nvidia.com/deeplearning/sdk/cudnn-support-matrix/index.html),
> cuDNN 7.5 is compatible with the GPU, CUDA 10, and requires driver r410.48
> (I assume greater or equal).
>
> The AMIs running on the g3.8xlarge have CUDA 10 and driver 410.73.
>
> So, as best I can tell, our environment ought to support cuDNN 7.5, which
> leads me to conclude that maybe there's something wrong in the code.
>
> The errors are always: "src/operator/./cudnn_rnn-inl.h:759: Check failed:
> e == CUDNN_STATUS_SUCCESS (6 vs. 0) cuDNN: CUDNN_STATUS_ARCH_MISMATCH".
>
> According to the cuDNN user guide (
> https://docs.nvidia.com/deeplearning/sdk/cudnn-developer-guide/index.html
> ):
>
> CUDNN_STATUS_ARCH_MISMATCH
>
> The function requires a feature absent from the current GPU device. Note
> that cuDNN only supports devices with compute capabilities greater than or
> equal to 3.0.
>
> To correct: compile and run the application on a device with appropriate
> compute capability.
>
> But, as we've seen, our environment seems to support this version of cuDNN
> and other versions go through CI w/o any problem...
>
> You can see some logs here:
>
> http://jenkins.mxnet-ci.amazon-ml.com/blue/organizations/jenkins/mxnet-validation%2Fcentos-gpu/detail/PR-14611/1/pipeline/
>
>
> http://jenkins.mxnet-ci.amazon-ml.com/blue/organizations/jenkins/mxnet-validation%2Funix-gpu/detail/PR-14611/12/pipeline/
>
> I have about 13 runs of this pipeline. The errors for different runs can
> be seen by changing the number before /pipeline (e.g.
> http://jenkins.mxnet-ci.amazon-ml.com/blue/organizations/jenkins/mxnet-validation%2Fcentos-gpu/detail/PR-14611/2/pipeline/
> 
>  for
> the 2nd run, etc.)
>
> Thanks in advance for the help!
>
> You can reach me here or on Slack if you have any questions =D
>
> Cheers,
>
> Per
>
> P.S. I'm attaching some instructions on how to reproduce the issue at home
> (or at least on a g3.8xlarge instance running ubuntu 16.04).
>
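For anyone wanting to poke at this outside CI, two hedged sketches (both assume
an MXNet GPU build on a CUDA machine). First, confirming which cuDNN the process
actually loads at runtime via the cudnnGetVersion() entry point:

import ctypes

cudnn = ctypes.CDLL("libcudnn.so.7")        # assumes cuDNN 7.x is on the loader path
cudnn.cudnnGetVersion.restype = ctypes.c_size_t
print(cudnn.cudnnGetVersion())              # e.g. 7500 for cuDNN 7.5.0

Second, a minimal repro that exercises the cuDNN RNN path the failing check
lives in (the exact test that fails in CI may differ):

import mxnet as mx

ctx = mx.gpu(0)
data = mx.nd.random.uniform(shape=(5, 3, 10), ctx=ctx)  # (seq_len, batch, input_size)
rnn = mx.gluon.rnn.LSTM(hidden_size=20, num_layers=2)   # hits cudnn_rnn-inl.h on GPU
rnn.initialize(ctx=ctx)
out = rnn(data)
out.wait_to_read()  # forces execution; raises if cuDNN returns ARCH_MISMATCH
print(out.shape)    # (5, 3, 20) on success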


Re: MXNet 1.4.1 Release Proposal

2019-04-08 Thread Junru Shao
Thanks for the great opportunity! Let's wait for some time for fixes and
proposals and decide the timeline then.

On Mon, Apr 8, 2019 at 1:02 PM Hagay Lupesko  wrote:

> Awesome - thanks Junru and Sheng!
> I have updated the CWiki to reflect you being the release manager and
> shepherd.
>
> Junru - I suggest we give the community a week more to add critical fix
> proposals, before we set a timeline. Please feel free to drive this
> forward, and I'm happy to help as needed.
>
> Thanks everyone,
> Hagay
>
> On Thu, Apr 4, 2019 at 2:27 PM Sheng Zha  wrote:
>
> > Thanks Hagay for proposing the release and for Junru to volunteer to
> drive
> > the release. I will help Junru as the committer for this release.
> >
> > -sz
> >
> > On Thu, Apr 4, 2019 at 2:18 PM Junru Shao 
> wrote:
> >
> > > Hi Hagay,
> > >
> > > I have some experiences in MXNet development, and would love to
> volunteer
> > > for driving this release.
> > >
> > > Thank you so much!
> > >
> > > Best,
> > > Junru
> > >
> > > On Thu, Apr 4, 2019 at 1:51 PM Hagay Lupesko 
> wrote:
> > >
> > > > Hello MXNet community,
> > > >
> > > > As previously discussed in [0
> > > > <
> > > >
> > >
> >
> https://lists.apache.org/thread.html/a5f444999bf428d06e691b1856392ae5ebb24a3485eaa484a73de10d@%3Cdev.mxnet.apache.org%3E
> > > > >],
> > > > and per the feedback from Pedro, Kellen and Sheng, I'd like to
> propose
> > > > releasing MXNet 1.4.1.
> > > > MXNet 1.4.1 is a patch release on top of 1.4.0 (following semver [1]),
> > > > that includes backwards-compatible bug fixes - a couple I am aware of
> > > > are mem leaks in the Scala API, Gluon RNN and NDArrays.
> > > >
> > > > I went ahead and created a draft release page on CWiki [2
> > > > <
> > > >
> > >
> >
> https://cwiki.apache.org/confluence/display/MXNET/%5BDRAFT+PROPOSAL%5D+Apache+MXNet+%28incubating%29+1.4.1+Release+Plan+and+Status
> > > > >],
> > > > thanks to Yuxi Hu for adding a mem leak fix, and thanks to Andrew
> > Ayres,
> > > > Qing Lan and Sergey Sokolov for fixing bugs in 1.4.0 - I went ahead
> and
> > > > added your fixes to the list.
> > > >
> > > > Asking the community to:
> > > > (1) Any bug fix or regression you identified and fixed after 1.4.0
> > > release?
> > > > please add it to the release proposal wiki (or msg me on Slack if you
> > > don't
> > > > have write access, happy to do it).
> > > > (2) Any comments or suggestions on the release wiki? please leave
> > > comments
> > > > on the wiki or reply to this email.
> > > > (3) I am looking for volunteers to drive the release - ideally we'll
> > have
> > > > two volunteers: a non-committer and a shepherd committer that can
> also
> > > help
> > > > with the logistics that require permissions. This is a great way to
> > > > contribute to the community and help MXNet!
> > > >
> > > > I plan to check in again in a few days and finalize the proposal, so timely
> > > > response is appreciated.
> > > >
> > > > Cheers,
> > > > Hagay
> > > >
> > > > [0]
> > > >
> > > >
> > >
> >
> https://lists.apache.org/thread.html/a5f444999bf428d06e691b1856392ae5ebb24a3485eaa484a73de10d@%3Cdev.mxnet.apache.org%3E
> > > > [1] https://semver.org/
> > > > [2]
> > > >
> > > >
> > >
> >
> https://cwiki.apache.org/confluence/display/MXNET/%5BDRAFT+PROPOSAL%5D+Apache+MXNet+%28incubating%29+1.4.1+Release+Plan+and+Status
> > > >
> > >
> >
>


Re: Fujitsu Breaks ImageNet Record using MXNet (under 75 sec)

2019-04-08 Thread Hagay Lupesko
Agreed!
I will mention this to my colleagues at Amazon that can help with that.

On Mon, Apr 8, 2019 at 1:32 PM Chaitanya Bapat  wrote:

> Yes. Moreover, we should be pushing it on our Twitter, Reddit, Medium, etc
> social channels.
>
> On Mon, 8 Apr 2019 at 15:55, Hagay Lupesko  wrote:
>
> > That's super cool Chai - thanks for sharing!
> > I also noticed that, and was seeing how we can reach out to the Fujitsu
> > guys so they can contribute back into MXNet...
> >
> > On Mon, Apr 8, 2019 at 10:14 AM Lin Yuan  wrote:
> >
> > > Chai,
> > >
> > > Thanks for sharing. This is awesome news!
> > >
> > > Lin
> > >
> > > On Mon, Apr 8, 2019 at 8:48 AM Chaitanya Bapat 
> > > wrote:
> > >
> > > > Greetings!
> > > >
> > > > Great start to a Monday morning, as I came across this news on Import
> > AI,
> > > > an AI newsletter.
> > > >
> > > > The newsletter talked about Apache MXNet, hence thought of sharing it
> > > with
> > > > our community. This seems to be a great achievement worth paying
> > > attention
> > > > to.
> > > >
> > > > *75 seconds: How long it takes to train a network against ImageNet:*
> > > > *...Fujitsu Research claims state-of-the-art ImageNet training
> > scheme...*
> > > > Researchers with Fujitsu Laboratories in Japan have further reduced
> the
> > > > time it takes to train large-scale, supervised learning AI models;
> > their
> > > > approach lets them train a residual network to around 75% accuracy on
> > the
> > > > ImageNet dataset after 74.7 seconds of training time. This is a big
> > leap
> > > > from where we were in 2017 (an hour), and is impressive relative to
> > > > late-2018 performance (around 4 minutes: see issue #121
> > > > <
> > > >
> > >
> >
> https://twitter.us13.list-manage.com/track/click?u=67bd06787e84d73db24fb0aa5=28edafc07a=0b77acb987
> > > > >
> > > > ).
> > > >
> > > > *How they did it: *The researchers trained their system across *2,048
> > > Tesla
> > > > V100 GPUs* via the Amazon-developed MXNet deep learning framework.
> They
> > > > used a large mini-batch size of 81,920, and also implemented
> layer-wise
> > > > adaptive scaling (LARS) and a 'warming up' period to increase
> learning
> > > > efficiency.
> > > >
> > > > *Why it matters:* Training large models on distributed infrastructure
> > is
> > > a
> > > > key component of modern AI research, and the reduction in time we've
> > seen
> > > > on ImageNet training is striking - I think this is emblematic of the
> > > > industrialization of AI, as people seek to create systematic
> approaches
> > > to
> > > > efficiently training models across large amounts of computers. This
> > trend
> > > > ultimately leads to a speedup in the rate of research reliant on
> > > > large-scale experimentation, and can unlock new paths of research.
> > > > *  Read more:* Yet Another Accelerated SGD: ResNet-50 Training on
> > > ImageNet
> > > > in 74.7 seconds (Arxiv)
> > > > <
> > > >
> > >
> >
> https://twitter.us13.list-manage.com/track/click?u=67bd06787e84d73db24fb0aa5=d2b13c879f=0b77acb987
> > > > >
> > > > .
> > > >
> > > > NVIDIA article -
> > > >
> > > >
> > >
> >
> https://news.developer.nvidia.com/fujitsu-breaks-imagenet-record-with-v100-tensor-core-gpus/
> > > >
> > > > Hope that gives further impetus to strive harder!
> > > > Have a good week!
> > > > Chai
> > > >
> > > >  --
> > > > *Chaitanya Prakash Bapat*
> > > > *+1 (973) 953-6299*
> > > >
> > > >
> > >
> >
>
>
> --
> *Chaitanya Prakash Bapat*
> *+1 (973) 953-6299*
>
>
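For readers curious about the LARS technique mentioned in the quoted summary,
a rough sketch of the layer-wise trust-ratio update from You et al. 2017
("Large Batch Training of Convolutional Networks") - a paraphrase of the
published rule, not Fujitsu's actual code:

import numpy as np

def lars_step(w, grad, lr, eta=0.001, wd=5e-5, eps=1e-9):
    """One LARS SGD step for one layer's weights (momentum omitted for brevity)."""
    w_norm = np.linalg.norm(w)
    g_norm = np.linalg.norm(grad)
    trust = eta * w_norm / (g_norm + wd * w_norm + eps)  # layer-wise trust ratio
    return w - lr * trust * (grad + wd * w)

w = np.random.randn(256, 128).astype(np.float32)
g = np.random.randn(256, 128).astype(np.float32)
w = lars_step(w, g, lr=0.1)

The per-layer ratio is what lets the global learning rate stay large at huge
batch sizes without the weights of individual layers diverging.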


Re: Fujitsu Breaks ImageNet Record using MXNet (under 75 sec)

2019-04-08 Thread Chaitanya Bapat
Yes. Moreover, we should be pushing it on our Twitter, Reddit, Medium, etc
social channels.

On Mon, 8 Apr 2019 at 15:55, Hagay Lupesko  wrote:

> That's super cool Chai - thanks for sharing!
> I also noticed that, and was seeing how we can reach out to the Fujitsu
> guys so they can contribute back into MXNet...
>
> On Mon, Apr 8, 2019 at 10:14 AM Lin Yuan  wrote:
>
> > Chai,
> >
> > Thanks for sharing. This is awesome news!
> >
> > Lin
> >
> > On Mon, Apr 8, 2019 at 8:48 AM Chaitanya Bapat 
> > wrote:
> >
> > > Greetings!
> > >
> > > Great start to a Monday morning, as I came across this news on Import
> AI,
> > > an AI newsletter.
> > >
> > > The newsletter talked about Apache MXNet, hence thought of sharing it
> > with
> > > our community. This seems to be a great achievement worth paying
> > attention
> > > to.
> > >
> > > *75 seconds: How long it takes to train a network against ImageNet:*
> > > *...Fujitsu Research claims state-of-the-art ImageNet training
> scheme...*
> > > Researchers with Fujitsu Laboratories in Japan have further reduced the
> > > time it takes to train large-scale, supervised learning AI models;
> their
> > > approach lets them train a residual network to around 75% accuracy on
> the
> > > ImageNet dataset after 74.7 seconds of training time. This is a big
> leap
> > > from where we were in 2017 (an hour), and is impressive relative to
> > > late-2018 performance (around 4 minutes: see issue #121
> > > <
> > >
> >
> https://twitter.us13.list-manage.com/track/click?u=67bd06787e84d73db24fb0aa5=28edafc07a=0b77acb987
> > > >
> > > ).
> > >
> > > *How they did it: *The researchers trained their system across *2,048
> > Tesla
> > > V100 GPUs* via the Amazon-developed MXNet deep learning framework. They
> > > used a large mini-batch size of 81,920, and also implemented layer-wise
> > > adaptive scaling (LARS) and a 'warming up' period to increase learning
> > > efficiency.
> > >
> > > *Why it matters:* Training large models on distributed infrastructure
> is
> > a
> > > key component of modern AI research, and the reduction in time we've
> seen
> > > on ImageNet training is striking - I think this is emblematic of the
> > > industrialization of AI, as people seek to create systematic approaches
> > to
> > > efficiently training models across large amounts of computers. This
> trend
> > > ultimately leads to a speedup in the rate of research reliant on
> > > large-scale experimentation, and can unlock new paths of research.
> > > *  Read more:* Yet Another Accelerated SGD: ResNet-50 Training on
> > ImageNet
> > > in 74.7 seconds (Arxiv)
> > > <
> > >
> >
> https://twitter.us13.list-manage.com/track/click?u=67bd06787e84d73db24fb0aa5=d2b13c879f=0b77acb987
> > > >
> > > .
> > >
> > > NVIDIA article -
> > >
> > >
> >
> https://news.developer.nvidia.com/fujitsu-breaks-imagenet-record-with-v100-tensor-core-gpus/
> > >
> > > Hope that gives further impetus to strive harder!
> > > Have a good week!
> > > Chai
> > >
> > >  --
> > > *Chaitanya Prakash Bapat*
> > > *+1 (973) 953-6299*
> > >
> > >
> >
>


-- 
*Chaitanya Prakash Bapat*
*+1 (973) 953-6299*



Re: [MXNET 2.0 Wishlist] [DISCUSS] Backend choices during runtime

2019-04-08 Thread Junru Shao
+1 Thanks Marco for sharing this!

It is great to see people agree with this feature; we have actually been
planning this for a while. We would love to share the plan as soon as
possible.


On Mon, Apr 8, 2019 at 9:42 AM Tianqi Chen  wrote:

> Just to clarify. I am not questioning the usefulness of the separation.
> Just want to highlight the technical challenges here based on our past
> experiences.
>
> Crossing DLL boundaries in C++ can create quite a lot of problems,
> especially when some of the dependencies were built with a different version
> of the compiler, follow static packaging, or simply because of dynamic-linking
> differences on Windows. These problems could make this direction less
> appealing compared to focusing effort on other things.
>
> Technically, as a first step, it is possible to make dependency changes
> avoid touching the global header files, and to use registration so that changing
> a certain component won't trigger a global recompile in CMake. This is also a
> required step toward some modularity.
>
> For plugins, solutions that use C ABI can be used for certain plugin
> modules.
>
> Some of the discussion has been tied to what the interface should look
> like. I think we should use different threads for these and put in more
> thought.
>
> Tianqi
>
>
>
> On Sun, Apr 7, 2019 at 4:39 PM kellen sunderland <
> kellen.sunderl...@gmail.com> wrote:
>
> > I think we can make some incremental progress.  My thoughts were along
> the
> > lines of plugins (thinking about what happens with the VLC project).  At
> > process launch time we could gather some information about our execution
> > environment (either through configuration, or by convention looking at
> our
> > folder structure and libraries available).  We could then later load the
> > components we need after understanding if we're using a CUDA backend and
> > what operators or subgraph components we would need.  Advantages would be
> > that we would move a lot of the current conditional compile logic to
> > runtime, and automate a lot of it.  It would also make packaging binaries
> > for targeted environments a little easier.  As an example we could
> compile
> > once, then remove CUDA focused libraries for systems that are going to
> run
> > on CPUs.
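A toy sketch of the plugin-style discovery described above, done via
dlopen-style loading against a stable C ABI (the library naming scheme here is
made up for illustration):

import ctypes
import glob
import os

def discover_backends(plugin_dir="plugins"):
    """Load whichever backend libraries are present; skip ones whose deps are missing."""
    backends = {}
    for path in glob.glob(os.path.join(plugin_dir, "libmxnet_backend_*.so")):
        name = os.path.basename(path)[len("libmxnet_backend_"):-len(".so")]
        try:
            backends[name] = ctypes.CDLL(path)
        except OSError:
            pass  # e.g. the CUDA plugin on a CPU-only host
    return backends

print(sorted(discover_backends()))  # e.g. ['cpu', 'cuda', 'mkldnn'] depending on host

Shipping one build and pruning .so files per target environment then becomes a
packaging decision rather than a compile-time one.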
> >
> > On Sun, Apr 7, 2019 at 2:45 PM Tianqi Chen 
> > wrote:
> >
> > > While I personally like the idea, this can be something that is fairly
> > > technically challenging, and I would caution against this idea vs. pushing
> > for
> > > good features and just allowing runtime configuration.
> > >
> > > The main problem here is due to the C++ ABI. There is no standard C++
> ABI
> > > across compilers, which means resorting to runtime DLL and dynamic
> > loading
> > > brings all sorts of technical problems, especially when multiple
> modules
> > > depend on the same third-party dependency (CUDA runtime).
> > > There is no good to go solution can be made here, especially given the
> > > explosion of the backend variants and dependencies in C++.
> > > A partial solution could be achieved through the sole use of a C ABI.
> > > Combining this with code generation can result in some simplifications
> and
> > > enable some runtime-loadable modules. TVM does this, and perhaps MXNet
> > could
> > > reuse some of that component for operator libraries. Similarly, having
> a
> > > customizable operator library that is loadable via C ABI might be
> > possible.
> > >
> > > So to summarize, while I really like the idea of dynamically loadable
> > > modules. My past experience suggests that this will bring a lot of
> > > additional engineering burden and technical debts without significant
> > > benefit. I would suggest starting by supporting something simple like a
> > > plugin module, before moving toward the general direction.
> > >
> > > Tianqi
> > >
> > > On Sun, Apr 7, 2019 at 1:31 PM kellen sunderland <
> > > kellen.sunderl...@gmail.com> wrote:
> > >
> > > > Strongly support the idea of runtime loadable components in MXNet.
> > > There's
> > > > no reason (other than perhaps engineering effort) we can't have a
> > single
> > > > compilation of MXNet that finds dependencies and chooses execution
> > paths
> > > > intelligently (or based on configuration) at runtime.
> > > >
> > > > On Thu, Apr 4, 2019 at 12:29 PM Marco de Abreu <
> marcoab...@apache.org>
> > > > wrote:
> > > >
> > > > > Hello,
> > > > >
> > > > > I'd like to start a discussion about something that I've noticed
> > being
> > > > > troublesome to maintain in the current version: Backend choices
> being
> > > > made
> > > > > at compile time.
> > > > >
> > > > > Right now, the different backends and accelerators (CPU, cuda, mkl,
> > AWS
> > > > > elastic inference, (future) AMD, openblas, TVM, etc.) are all
> scattered
> > > > > across the different layers of MXNet. On one hand, we have compile
> > time
> > > > > flags that decide which backends are being compiled into the
> binary,
> > > > while
> > > > > at the same time choices can be made in the 

Re: assimilation of mshadow into the MXNet codebase

2019-04-08 Thread Pedro Larroy
There's a flag MSHADOW_STAND_ALONE which supports gemm but not all of the
BLAS routines, and it looks like an untested code path. From what I have
seen, I don't think we use this from MXNet, hence the need for a BLAS
implementation.


On Sun, Apr 7, 2019 at 6:16 PM Zhao, Patric  wrote:
>
> Agree.
>
> Recently, we (Tao, Shufan, Pengxin) are trying to integrate the Intel MKL 
> math functions into mshadow and MXNet.
> We have to go through two repos and make lots of tradeoffs between them.
> If we can move mshadow into MXNet, it will be more flexible to redesign and 
> refactor parts of legacy code.
>
> > -Original Message-
> > From: Sheng Zha [mailto:zhash...@apache.org]
> > Sent: Monday, April 8, 2019 5:48 AM
> > To: d...@mxnet.apache.org
> > Subject: Re: assimilation of mshadow into the MXNet codebase
> >
> > mshadow depends on *a* BLAS library, and there's nothing inherent in
> > mshadow code base that requires OpenBLAS over MKL. The linked issue
> > #11769 seems to be more of a build logic issue.
> >
> > -sz
> >
> > On 2019/04/07 18:56:43, Aaron Markham 
> > wrote:
> > > +1
> > > Reduced complexity. Choice of math library... Hopefully you can just
> > > install MKL and not be forced into mshadow's dependency on OpenBLAS.
> > > This could make Windows setup easier.
> > > Maybe this issue will get fixed: #11769.
> > >
> > > On Sun, Apr 7, 2019, 00:51 Junru Shao  wrote:
> > >
> > > > Does merging mshadow into mxnet bring any actual benefit for
> > > > customers in terms of performance, portability, or anything else?
> > > >
> > > > On Fri, Apr 5, 2019 at 9:38 PM Tianqi Chen
> > > > 
> > > > wrote:
> > > >
> > > > > Technically, mshadow is sufficient for MXNet. Adopting other
> > > > > libraries ( eigen or xtensor) will unnecessarily increase the
> > > > > codebase complexity without any additional gains.
> > > > >
> > > > > Given that mshadow is only used by MXNet, I do support donating it
> > > > > into the mxnet codebase.
> > > > > To respect the original mshadow community, I would recommend
> > > > > starting a community RFC in the mshadow GitHub issue for a week,
> > > > > before we start the migration process.
> > > > > Also, I would recommend a rebase merge just like the case of
> > > > > MXNet.jl
> > > > code
> > > > > base to preserve the contribution history.
> > > > >
> > > > > Tianqi
> > > > >
> > > > >
> > > > > On Fri, Apr 5, 2019 at 9:25 PM Alfredo Luque
> > > > >  wrote:
> > > > >
> > > > > > Do you have a link to both of these proposals?
> > > > > >
> > > > > > On Fri, Apr 5, 2019 at 20:14 Anirudh Acharya
> > > > > > 
> > > > > > wrote:
> > > > > >
> > > > > > > Hi Pedro,
> > > > > > >
> > > > > > > mshadow is mostly used for tensor arithmetic. There have been
> > > > > discussions
> > > > > > > about including it within mxnet. I think it is a good idea.
> > > > > > >
> > > > > > > As a more long term solution using libraries like eigen to
> > > > > > > perform
> > > > > linear
> > > > > > > algebra operations was also suggested by anirudh2290@. I think
> > > > > xtensor(
> > > > > > > https://github.com/QuantStack/xtensor ) can also be a
> > > > > > > candidate
> > > > here.
> > > > > > >
> > > > > > > -
> > > > > > > Anirudh
> > > > > > >
> > > > > > >
> > > > > > > On Fri, Apr 5, 2019 at 7:03 PM Pedro Larroy <
> > > > > > pedro.larroy.li...@gmail.com>
> > > > > > > wrote:
> > > > > > >
> > > > > > > > Hi
> > > > > > > >
> > > > > > > > Some developers have noticed that working in mshadow is
> > > > > > > > cumbersome
> > > > as
> > > > > > > > it's a 3rdparty subrepo.
> > > > > > > >
> > > > > > > > Since mshadow is a bunch of headers which don't have much in
> > > > > > > > the way of independent tests / library functionality, other
> > > > > > > > developers and I believe that it would be good to assimilate
> > > > > > > > this code in the repository for ease of contribution and
> > > > > > > > changes without having to go through contortions to test PRs
> > > > > > > > that modify mshadow.
> > > > > > > >
> > > > > > > > Would anybody oppose this change?
> > > > > > > >
> > > > > > > > Thanks and have a nice weekend.
> > > > > > > >
> > > > > > > > Pedro.
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >


Re: MXNet 1.4.1 Release Proposal

2019-04-08 Thread Hagay Lupesko
Awesome - thanks Junru and Sheng!
I have updated the CWiki to reflect you being the release manager and
shepherd.

Junru - I suggest we give the community a week more to add critical fix
proposals, before we set a timeline. Please feel free to drive this
forward, and I'm happy to help as needed.

Thanks everyone,
Hagay

On Thu, Apr 4, 2019 at 2:27 PM Sheng Zha  wrote:

> Thanks Hagay for proposing the release and for Junru to volunteer to drive
> the release. I will help Junru as the committer for this release.
>
> -sz
>
> On Thu, Apr 4, 2019 at 2:18 PM Junru Shao  wrote:
>
> > Hi Hagay,
> >
> > I have some experiences in MXNet development, and would love to volunteer
> > for driving this release.
> >
> > Thank you so much!
> >
> > Best,
> > Junru
> >
> > On Thu, Apr 4, 2019 at 1:51 PM Hagay Lupesko  wrote:
> >
> > > Hello MXNet community,
> > >
> > > As previously discussed in [0
> > > <
> > >
> >
> https://lists.apache.org/thread.html/a5f444999bf428d06e691b1856392ae5ebb24a3485eaa484a73de10d@%3Cdev.mxnet.apache.org%3E
> > > >],
> > > and per the feedback from Pedro, Kellen and Sheng, I'd like to propose
> > > releasing MXNet 1.4.1.
> > > MXNet 1.4.1 is a patch release on top of 1.4.0 (following semver [1]),
> > > that includes backwards-compatible bug fixes - a couple I am aware of
> > > are memory leaks in the Scala API, Gluon RNN, and NDArrays.
> > >
> > > I went ahead and created a draft release page on CWiki [2
> > > <
> > >
> >
> https://cwiki.apache.org/confluence/display/MXNET/%5BDRAFT+PROPOSAL%5D+Apache+MXNet+%28incubating%29+1.4.1+Release+Plan+and+Status
> > > >],
> > > thanks to Yuxi Hu for adding a mem leak fix, and thanks to Andrew
> Ayres,
> > > Qing Lan and Sergey Sokolov for fixing bugs in 1.4.0 - I went ahead and
> > > added your fixes to the list.
> > >
> > > Asking the community to:
> > > (1) Any bug fix or regression you identified and fixed after 1.4.0
> > release?
> > > Please add it to the release proposal wiki (or msg me on Slack if you
> > don't
> > > have write access, happy to do it).
> > > (2) Any comments or suggestions on the release wiki? Please leave
> > comments
> > > on the wiki or reply to this email.
> > > (3) I am looking for volunteers to drive the release - ideally we'll
> have
> > > two volunteers: a non-committer and a shepherd committer that can also
> > help
> > > with the logistics that require permissions. This is a great way to
> > > contribute to the community and help MXNet!
> > >
> > > I plan to check-in in a few days and finalize the proposal, so timely
> > > response is appreciated.
> > >
> > > Cheers,
> > > Hagay
> > >
> > > [0]
> > >
> > >
> >
> https://lists.apache.org/thread.html/a5f444999bf428d06e691b1856392ae5ebb24a3485eaa484a73de10d@%3Cdev.mxnet.apache.org%3E
> > > [1] https://semver.org/
> > > [2]
> > >
> > >
> >
> https://cwiki.apache.org/confluence/display/MXNET/%5BDRAFT+PROPOSAL%5D+Apache+MXNet+%28incubating%29+1.4.1+Release+Plan+and+Status
> > >
> >
>


Re: assimilation of mshadow into the MXNet codebase

2019-04-08 Thread Junru Shao
Thanks Kellen for the explanation, +1 for this!

On Sun, Apr 7, 2019 at 6:16 PM Zhao, Patric  wrote:

> Agree.
>
> Recently, we (Tao, Shufan, Pengxin) are trying to integrate the Intel MKL
> math functions into mshadow and MXNet.
> We have to go through two repos and make lots of tradeoffs between them.
> If we can move mshadow into MXNet, it will be more flexible to redesign
> and refactor parts of legacy code.
>
> > -Original Message-
> > From: Sheng Zha [mailto:zhash...@apache.org]
> > Sent: Monday, April 8, 2019 5:48 AM
> > To: d...@mxnet.apache.org
> > Subject: Re: assimilation of mshadow into the MXNet codebase
> >
> > mshadow depends on *a* BLAS library, and there's nothing inherent in
> > mshadow code base that requires OpenBLAS over MKL. The linked issue
> > #11769 seems to be more of a build logic issue.
> >
> > -sz
> >
> > On 2019/04/07 18:56:43, Aaron Markham 
> > wrote:
> > > +1
> > > Reduced complexity. Choice of math library... Hopefully you can just
> > > install MKL and not be forced into mshadow's dependency on OpenBLAS.
> > > This could make Windows setup easier.
> > > Maybe this issue will get fixed: #11769.
> > >
> > > On Sun, Apr 7, 2019, 00:51 Junru Shao  wrote:
> > >
> > > > Does merging mshadow into mxnet bring any actual benefit for
> > > > customers in terms of performance, portability, or anything else?
> > > >
> > > > On Fri, Apr 5, 2019 at 9:38 PM Tianqi Chen
> > > > 
> > > > wrote:
> > > >
> > > > > Technically, mshadow is sufficient for MXNet. Adopting other
> > > > > libraries ( eigen or xtensor) will unnecessarily increase the
> > > > > codebase complexity without any additional gains.
> > > > >
> > > > > Given that mshadow is only used by MXNet, I do support donating it
> > > > > into the mxnet codebase.
> > > > > To respect the original mshadow community, I would recommend
> > > > > starting a community RFC in the mshadow GitHub issue for a week,
> > > > > before we start the migration process.
> > > > > Also, I would recommend a rebase merge just like the case of
> > > > > MXNet.jl
> > > > code
> > > > > base to preserve the contribution history.
> > > > >
> > > > > Tianqi
> > > > >
> > > > >
> > > > > On Fri, Apr 5, 2019 at 9:25 PM Alfredo Luque
> > > > >  wrote:
> > > > >
> > > > > > Do you have a link to both of these proposals?
> > > > > >
> > > > > > On Fri, Apr 5, 2019 at 20:14 Anirudh Acharya
> > > > > > 
> > > > > > wrote:
> > > > > >
> > > > > > > Hi Pedro,
> > > > > > >
> > > > > > > mshadow is mostly used for tensor arithmetic. There have been
> > > > > discussions
> > > > > > > about including it within mxnet. I think it is a good idea.
> > > > > > >
> > > > > > > As a more long term solution using libraries like eigen to
> > > > > > > perform
> > > > > linear
> > > > > > > algebra operations was also suggested by anirudh2290@. I think
> > > > > xtensor(
> > > > > > > https://github.com/QuantStack/xtensor ) can also be a
> > > > > > > candidate
> > > > here.
> > > > > > >
> > > > > > > -
> > > > > > > Anirudh
> > > > > > >
> > > > > > >
> > > > > > > On Fri, Apr 5, 2019 at 7:03 PM Pedro Larroy <
> > > > > > pedro.larroy.li...@gmail.com>
> > > > > > > wrote:
> > > > > > >
> > > > > > > > Hi
> > > > > > > >
> > > > > > > > Some developers have noticed that working in mshadow is
> > > > > > > > cumbersome
> > > > as
> > > > > > > > it's a 3rdparty subrepo.
> > > > > > > >
> > > > > > > > Since mshadow is a bunch of headers which don't have much in
> > > > > > > > the way of independent tests / library functionality, other
> > > > > > > > developers and I believe that it would be good to assimilate
> > > > > > > > this code in the repository for ease of contribution and
> > > > > > > > changes without having to go through contortions to test PRs
> > > > > > > > that modify mshadow.
> > > > > > > >
> > > > > > > > Would anybody oppose this change?
> > > > > > > >
> > > > > > > > Thanks and have a nice weekend.
> > > > > > > >
> > > > > > > > Pedro.
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
>


Re: Fujitsu Breaks ImageNet Record using MXNet (under 75 sec)

2019-04-08 Thread Hagay Lupesko
That's super cool Chai - thanks for sharing!
I also noticed that, and was seeing how we can reach out to the Fujitsu
guys so they can contribute back into MXNet...

On Mon, Apr 8, 2019 at 10:14 AM Lin Yuan  wrote:

> Chai,
>
> Thanks for sharing. This is awesome news!
>
> Lin
>
> On Mon, Apr 8, 2019 at 8:48 AM Chaitanya Bapat 
> wrote:
>
> > Greetings!
> >
> > Great start to a Monday morning, as I came across this news on Import AI,
> > an AI newsletter.
> >
> > The newsletter talked about Apache MXNet, hence I thought of sharing it
> > with our community. This seems to be a great achievement worth paying
> > attention to.
> >
> > *75 seconds: How long it takes to train a network against ImageNet:*
> > *...Fujitsu Research claims state-of-the-art ImageNet training scheme...*
> > Researchers with Fujitsu Laboratories in Japan have further reduced the
> > time it takes to train large-scale, supervised learning AI models; their
> > approach lets them train a residual network to around 75% accuracy on the
> > ImageNet dataset after 74.7 seconds of training time. This is a big leap
> > from where we were in 2017 (an hour), and is impressive relative to
> > late-2018 performance (around 4 minutes: see issue #121
> > <
> >
> https://twitter.us13.list-manage.com/track/click?u=67bd06787e84d73db24fb0aa5=28edafc07a=0b77acb987
> > >
> > ).
> >
> > *How they did it: *The researchers trained their system across *2,048
> Tesla
> > V100 GPUs* via the Amazon-developed MXNet deep learning framework. They
> > used a large mini-batch size of 81,920, and also implemented layer-wise
> > adaptive scaling (LARS) and a 'warming up' period to increase learning
> > efficiency.
> >
> > *Why it matters:* Training large models on distributed infrastructure is
> a
> > key component of modern AI research, and the reduction in time we've seen
> > on ImageNet training is striking - I think this is emblematic of the
> > industrialization of AI, as people seek to create systematic approaches
> to
> > efficiently training models across large amounts of computers. This trend
> > ultimately leads to a speedup in the rate of research reliant on
> > large-scale experimentation, and can unlock new paths of research.
> > *  Read more:* Yet Another Accelerated SGD: ResNet-50 Training on
> ImageNet
> > in 74.7 seconds (Arxiv)
> > <
> >
> https://twitter.us13.list-manage.com/track/click?u=67bd06787e84d73db24fb0aa5=d2b13c879f=0b77acb987
> > >
> > .
> >
> > NVIDIA article -
> >
> >
> https://news.developer.nvidia.com/fujitsu-breaks-imagenet-record-with-v100-tensor-core-gpus/
> >
> > Hope that gives further impetus to strive harder!
> > Have a good week!
> > Chai
> >
> >  --
> > *Chaitanya Prakash Bapat*
> > *+1 (973) 953-6299*
> >
> >
>


Re: Fujitsu Breaks ImageNet Record using MXNet (under 75 sec)

2019-04-08 Thread Lin Yuan
Chai,

Thanks for sharing. This is awesome news!

Lin

On Mon, Apr 8, 2019 at 8:48 AM Chaitanya Bapat  wrote:

> Greetings!
>
> Great start to a Monday morning, as I came across this news on Import AI,
> an AI newsletter.
>
> The newsletter talked about Apache MXNet, hence I thought of sharing it with
> our community. This seems to be a great achievement worth paying attention
> to.
>
> *75 seconds: How long it takes to train a network against ImageNet:*
> *...Fujitsu Research claims state-of-the-art ImageNet training scheme...*
> Researchers with Fujitsu Laboratories in Japan have further reduced the
> time it takes to train large-scale, supervised learning AI models; their
> approach lets them train a residual network to around 75% accuracy on the
> ImageNet dataset after 74.7 seconds of training time. This is a big leap
> from where we were in 2017 (an hour), and is impressive relative to
> late-2018 performance (around 4 minutes: see issue #121
> <
> https://twitter.us13.list-manage.com/track/click?u=67bd06787e84d73db24fb0aa5=28edafc07a=0b77acb987
> >
> ).
>
> *How they did it: *The researchers trained their system across *2,048 Tesla
> V100 GPUs* via the Amazon-developed MXNet deep learning framework. They
> used a large mini-batch size of 81,920, and also implemented layer-wise
> adaptive scaling (LARS) and a 'warming up' period to increase learning
> efficiency.
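
For readers unfamiliar with LARS, the per-layer update can be sketched
roughly as follows -- an illustrative simplification in C++, not the exact
Fujitsu implementation; the trust coefficient and weight decay defaults here
are assumptions:

    #include <cmath>
    #include <cstddef>
    #include <vector>

    // Sketch of a LARS step for one layer: rescale the learning rate per
    // layer by ||w|| / (||grad|| + wd * ||w||) so that large-batch training
    // stays stable across layers with very different weight magnitudes.
    void lars_update(std::vector<float>& w, const std::vector<float>& grad,
                     float lr, float wd = 1e-4f, float trust = 0.001f) {
      float w_norm = 0.f, g_norm = 0.f;
      for (std::size_t i = 0; i < w.size(); ++i) {
        w_norm += w[i] * w[i];
        g_norm += grad[i] * grad[i];
      }
      w_norm = std::sqrt(w_norm);
      g_norm = std::sqrt(g_norm);
      float local_lr = lr;
      if (w_norm > 0.f && g_norm > 0.f)
        local_lr = lr * trust * w_norm / (g_norm + wd * w_norm);
      for (std::size_t i = 0; i < w.size(); ++i)
        w[i] -= local_lr * (grad[i] + wd * w[i]);  // SGD step with local LR
    }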
>
> *Why it matters:* Training large models on distributed infrastructure is a
> key component of modern AI research, and the reduction in time we've seen
> on ImageNet training is striking - I think this is emblematic of the
> industrialization of AI, as people seek to create systematic approaches to
> efficiently training models across large amounts of computers. This trend
> ultimately leads to a speedup in the rate of research reliant on
> large-scale experimentation, and can unlock new paths of research.
> *  Read more:* Yet Another Accelerated SGD: ResNet-50 Training on ImageNet
> in 74.7 seconds (Arxiv)
> <
> https://twitter.us13.list-manage.com/track/click?u=67bd06787e84d73db24fb0aa5=d2b13c879f=0b77acb987
> >
> .
>
> NVIDIA article -
>
> https://news.developer.nvidia.com/fujitsu-breaks-imagenet-record-with-v100-tensor-core-gpus/
>
> Hope that gives further impetus to strive harder!
> Have a good week!
> Chai
>
>  --
> *Chaitanya Prakash Bapat*
> *+1 (973) 953-6299*
>
>


Re: [MXNET 2.0 Wishlist] [DISCUSS] Backend choices during runtime

2019-04-08 Thread Tianqi Chen
Just to clarify. I am not questioning the usefulness of the separation.
Just want to highlight the technical challenges here based on our past
experiences.

Crossing DLL boundaries in C++ can create quite a lot of problems, especially
when some of the dependencies use a different version of the compiler or
follow static packaging, or simply because of dynamic linking differences on
Windows. These problems could make this direction less appealing compared to
focusing effort on other things.

Technically, as a first step, it is possible to make dependency changes that
do not touch the global header files, and to use registration so that
changing a certain component won't trigger a global recompile in CMake. This
is also a required step toward some modularity.
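
To make the registration idea concrete, here is a generic sketch of the
pattern (not MXNet's actual registry; names are made up for illustration):

    #include <functional>
    #include <map>
    #include <string>

    // Components register themselves at static-initialization time, so
    // adding or changing one never touches a global header and never forces
    // a global recompile.
    using Factory = std::function<void*()>;

    std::map<std::string, Factory>& Registry() {
      static std::map<std::string, Factory> r;  // constructed on first use
      return r;
    }

    struct Registrar {
      Registrar(const std::string& name, Factory f) {
        Registry()[name] = std::move(f);
      }
    };

    #define REGISTER_COMPONENT(name, Type) \
      static Registrar registrar_##Type(name, [] { return (void*)new Type(); })

    // Usage, in the component's own .cc file only:
    //   REGISTER_COMPONENT("my_component", MyComponent);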

For plugins, solutions that use C ABI can be used for certain plugin
modules.

Some of the discussion has been tied to what the interface should look
like. I think we should use different threads for these and put in more
thought.

Tianqi



On Sun, Apr 7, 2019 at 4:39 PM kellen sunderland <
kellen.sunderl...@gmail.com> wrote:

> I think we can make some incremental progress.  My thoughts were along the
> lines of plugins (thinking about what happens with the VLC project).  At
> process launch time we could gather some information about our execution
> environment (either through configuration, or by convention looking at our
> folder structure and libraries available).  We could then later load the
> components we need after understanding if we're using a CUDA backend and
> what operators or subgraph components we would need.  Advantages would be
> that we would move a lot of the current conditional compile logic to
> runtime, and automate a lot of it.  It would also make packaging binaries
> for targeted environments a little easier.  As an example we could compile
> once, then remove CUDA focused libraries for systems that are going to run
> on CPUs.
>
> On Sun, Apr 7, 2019 at 2:45 PM Tianqi Chen 
> wrote:
>
> > While I personally like the idea, this can be something that is fairly
> > technically challenging, and I would caution against it versus pushing
> > for good features and simply allowing runtime configuration.
> >
> > The main problem here is due to the C++ ABI. There is no standard C++ ABI
> > across compilers, which means resorting to runtime DLLs and dynamic
> > loading brings all sorts of technical problems, especially when multiple
> > modules depend on the same third-party dependency (the CUDA runtime).
> > There is no good go-to solution here, especially given the explosion of
> > backend variants and dependencies in C++.
> > A partial solution could be achieved through the sole use of C ABI.
> > Combining this with code generation can result in some simplifications and
> > enable some runtime-loadable modules. TVM does this, and perhaps MXNet
> could
> > reuse some of that component for operator libraries. Similarly, having a
> > customizable operator library that is loadable via C ABI might be
> possible.
> >
> > So to summarize, while I really like the idea of dynamically loadable
> > modules, my past experience suggests that this will bring a lot of
> > additional engineering burden and technical debt without significant
> > benefit. I would suggest starting by supporting something simple like a
> > plugin module, before moving toward the general direction.
> >
> > Tianqi
> >
> > On Sun, Apr 7, 2019 at 1:31 PM kellen sunderland <
> > kellen.sunderl...@gmail.com> wrote:
> >
> > > Strongly support the idea of runtime loadable components in MXNet.
> > There's
> > > no reason (other than perhaps engineering effort) we can't have a
> single
> > > compilation of MXNet that finds dependencies and chooses execution
> paths
> > > intelligently (or based on configuration) at runtime.
> > >
> > > On Thu, Apr 4, 2019 at 12:29 PM Marco de Abreu 
> > > wrote:
> > >
> > > > Hello,
> > > >
> > > > I'd like to start a discussion about something that I've noticed
> being
> > > > troublesome to maintain in the current version: Backend choices being
> > > made
> > > > at compile time.
> > > >
> > > > Right now, the different backends and accelerators (CPU, cuda, mkl,
> AWS
> > > > elastic inference, (future) AMD, openblas, TVM, etc) are all scattered
> > > > across the different layers of MXNet. On one hand, we have compile
> time
> > > > flags that decide which backends are being compiled into the binary,
> > > while
> > > > at the same time choices can be made in the frontend during runtime.
> > > >
> > > > At the moment, we have a lot of conditional build logic that picks
> > > > different parts. With the addition of MKLML and later MKLDNN the
> > > > clear separation of CPU and GPU got kind of broken up. While we have
> > > > some places where each backend's code lives, in the end we resort to
> > > > some files containing a lot of conditional logic for the different
> > > > backends (sorry I can't provide links right now since I'm on mobile). 

RE: assimilation of mshadow into the MXNet codebase

2019-04-07 Thread Zhao, Patric
Agree.

Recently, we (Tao, Shufan, Pengxin) are trying to integrate the Intel MKL math 
functions into mshadow and MXNet. 
We have to go through two repos and make lots of tradeoffs between them.
If we can move mshadow into MXNet, it will be more flexible to redesign and 
refactor parts of legacy code.

> -Original Message-
> From: Sheng Zha [mailto:zhash...@apache.org]
> Sent: Monday, April 8, 2019 5:48 AM
> To: d...@mxnet.apache.org
> Subject: Re: assimilation of mshadow into the MXNet codebase
> 
> mshadow depends on *a* BLAS library, and there's nothing inherent in
> mshadow code base that requires OpenBLAS over MKL. The linked issue
> #11769 seems to be more of a build logic issue.
> 
> -sz
> 
> On 2019/04/07 18:56:43, Aaron Markham 
> wrote:
> > +1
> > Reduced complexity. Choice of math library... Hopefully you can just
> > install MKL and not be forced into mshadow's dependency on OpenBLAS.
> > This could make Windows setup easier.
> > Maybe this issue will get fixed: #11769.
> >
> > On Sun, Apr 7, 2019, 00:51 Junru Shao  wrote:
> >
> > > Does merging mshadow into mxnet bring any actual benefit for
> > > customers in terms of performance, portability, or anything else?
> > >
> > > On Fri, Apr 5, 2019 at 9:38 PM Tianqi Chen
> > > 
> > > wrote:
> > >
> > > > Technically, mshadow is sufficient for MXNet. Adopting other
> > > > libraries ( eigen or xtensor) will unnecessarily increase the
> > > > codebase complexity without any additional gains.
> > > >
> > > > Given that mshadow is only used by MXNet, I do support donating it
> > > > into the mxnet codebase.
> > > > To respect the original mshadow community, I would recommend
> > > > starting a community RFC in the mshadow GitHub issue for a week,
> > > > before we start the migration process.
> > > > Also, I would recommend a rebase merge just like the case of
> > > > MXNet.jl
> > > code
> > > > base to preserve the contribution history.
> > > >
> > > > Tianqi
> > > >
> > > >
> > > > On Fri, Apr 5, 2019 at 9:25 PM Alfredo Luque
> > > >  wrote:
> > > >
> > > > > Do you have a link to both of these proposals?
> > > > >
> > > > > On Fri, Apr 5, 2019 at 20:14 Anirudh Acharya
> > > > > 
> > > > > wrote:
> > > > >
> > > > > > Hi Pedro,
> > > > > >
> > > > > > mshadow is mostly used for tensor arithmetic. There have been
> > > > discussions
> > > > > > about including it within mxnet. I think it is a good idea.
> > > > > >
> > > > > > As a more long term solution using libraries like eigen to
> > > > > > perform
> > > > linear
> > > > > > algebra operations was also suggested by anirudh2290@. I think
> > > > xtensor(
> > > > > > https://github.com/QuantStack/xtensor ) can also be a
> > > > > > candidate
> > > here.
> > > > > >
> > > > > > -
> > > > > > Anirudh
> > > > > >
> > > > > >
> > > > > > On Fri, Apr 5, 2019 at 7:03 PM Pedro Larroy <
> > > > > pedro.larroy.li...@gmail.com>
> > > > > > wrote:
> > > > > >
> > > > > > > Hi
> > > > > > >
> > > > > > > Some developers have noticed that working in mshadow is
> > > > > > > cumbersome
> > > as
> > > > > > > it's a 3rdparty subrepo.
> > > > > > >
> > > > > > > Since mshadow is a bunch of headers which don't have much in
> > > > > > > the way of independent tests / library functionality, other
> > > > > > > developers and I believe that it would be good to assimilate
> > > > > > > this code in the repository for ease of contribution and
> > > > > > > changes without having to go through contortions to test PRs
> > > > > > > that modify mshadow.
> > > > > > >
> > > > > > > Would anybody oppose this change?
> > > > > > >
> > > > > > > Thanks and have a nice weekend.
> > > > > > >
> > > > > > > Pedro.
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >


Re: [MXNET 2.0 Wishlist] [DISCUSS] Backend choices during runtime

2019-04-07 Thread kellen sunderland
I think we can make some incremental progress.  My thoughts were along the
lines of plugins (thinking about what happens with the VLC project).  At
process launch time we could gather some information about our execution
environment (either through configuration, or by convention looking at our
folder structure and libraries available).  We could then later load the
components we need after understanding if we're using a CUDA backend and
what operators or subgraph components we would need.  Advantages would be
that we would move a lot of the current conditional compile logic to
runtime, and automate a lot of it.  It would also make packaging binaries
for targeted environments a little easier.  As an example we could compile
once, then remove CUDA focused libraries for systems that are going to run
on CPUs.
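
As a concrete sketch of that launch-time probing, assuming POSIX dlopen and a
conventional plugin directory (the library and symbol names below are
illustrative, not actual MXNet artifacts):

    #include <dlfcn.h>
    #include <cstdio>

    // Probe for the CUDA runtime at startup; if a usable device is present,
    // load the CUDA backend module, otherwise fall back to a CPU-only one.
    void* load_backend() {
      if (void* cudart = dlopen("libcudart.so", RTLD_NOW | RTLD_LOCAL)) {
        using GetCount = int (*)(int*);
        auto get_count =
            reinterpret_cast<GetCount>(dlsym(cudart, "cudaGetDeviceCount"));
        int n = 0;
        if (get_count && get_count(&n) == 0 && n > 0)
          return dlopen("modules/libmxnet_backend_cuda.so", RTLD_NOW);
        dlclose(cudart);
      }
      std::puts("no usable GPU found, loading CPU backend");
      return dlopen("modules/libmxnet_backend_cpu.so", RTLD_NOW);
    }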

On Sun, Apr 7, 2019 at 2:45 PM Tianqi Chen  wrote:

> While I personally like the idea, this can be something that is fairly
> technically challenging, and I would caution against it versus pushing
> for good features and simply allowing runtime configuration.
>
> The main problem here is due to the C++ ABI. There is no standard C++ ABI
> across compilers, which means resorting to runtime DLLs and dynamic loading
> brings all sorts of technical problems, especially when multiple modules
> depend on the same third-party dependency (the CUDA runtime).
> There is no good go-to solution here, especially given the explosion of
> backend variants and dependencies in C++.
> A partial solution could be achieved through the sole use of C ABI.
> Combining this with code generation can result in some simplifications and
> enable some runtime-loadable modules. TVM does this, and perhaps MXNet
> reuse some of that component for operator libraries. Similarly, having a
> customizable operator library that is loadable via C ABI might be possible.
>
> So to summarize, while I really like the idea of dynamically loadable
> modules, my past experience suggests that this will bring a lot of
> additional engineering burden and technical debt without significant
> benefit. I would suggest starting by supporting something simple like a
> plugin module, before moving toward the general direction.
>
> Tianqi
>
> On Sun, Apr 7, 2019 at 1:31 PM kellen sunderland <
> kellen.sunderl...@gmail.com> wrote:
>
> > Strongly support the idea of runtime loadable components in MXNet.
> There's
> > no reason (other than perhaps engineering effort) we can't have a single
> > compilation of MXNet that finds dependencies and chooses execution paths
> > intelligently (or based on configuration) at runtime.
> >
> > On Thu, Apr 4, 2019 at 12:29 PM Marco de Abreu 
> > wrote:
> >
> > > Hello,
> > >
> > > I'd like to start a discussion about something that I've noticed being
> > > troublesome to maintain in the current version: Backend choices being
> > made
> > > at compile time.
> > >
> > > Right now, the different backends and accelerators (CPU, cuda, mkl, AWS
> > > elastic inference, (future) AMD, openblas, TVM, etc) are all scattered
> > > across the different layers of MXNet. On one hand, we have compile time
> > > flags that decide which backends are being compiled into the binary,
> > while
> > > at the same time choices can be made in the frontend during runtime.
> > >
> > > At the moment, we have a lot of conditional build logic that picks
> > > different parts. With the addition of MKLML and later MKLDNN the clear
> > > separation of CPU and GPU got kind of broken up. While we have some
> > > places where each backend's code lives, in the end we resort to some
> > > files containing a lot of conditional logic for the different backends
> > > (sorry I can't provide links right now since I'm on mobile). To me this
> > > seems like a residue of the fast development style from the early days
> > > (more preprocessor statements and less object orientation) while also
> > > having organic growth with new
> > > accelerators. When I see how much AMD had to hack to fit in their
> > > implementation, it seemed like we have to make this part more developer
> > > friendly.
> > >
> > > At the moment, every new flavour of MXNet has to be entirely
> recompiled.
> > > This makes it hard for users to figure out which options to use, while
> it
> > > makes it harder for us to test since the overhead to test every single
> > > combination of compile parameters would be overwhelming.
> > >
> > > I'd propose to have a clear class hierarchy based structure for
> > > accelerators, operators and memory management. This structure can then
> be
> > > implemented by the different backends. To reduce the compile burden, we
> > > would introduce dynamic loading and split the different backends into
> > > modules. These could then be developed, maintained and compiled on
> their
> > > own and then placed in a "module" folder to be loaded at runtime.
> > > Adding a
> > > new accelerator would be a matter of placing the precompiled binary into
> > > the folder. The detailed configuration of that Backend would then be
> > > done at runtime.

Re: assimilation of mshadow into the MXNet codebase

2019-04-07 Thread Sheng Zha
mshadow depends on *a* BLAS library, and there's nothing inherent in mshadow 
code base that requires OpenBLAS over MKL. The linked issue #11769 seems to be 
more of a build logic issue.
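
Concretely, both ship the standard CBLAS entry points, so code written
against that interface links equally well against either -- a quick sketch:

    #include <cblas.h>  // OpenBLAS ships this header; MKL exposes the same
                        // entry points via its mkl_cblas.h

    // C (2x2) = A (2x3) * B (3x2). cblas_sgemm has the same signature in
    // every conforming BLAS, which is why the choice of OpenBLAS vs MKL can
    // be a pure build/link decision rather than a code change.
    void sgemm_demo() {
      const float A[6] = {1, 2, 3, 4, 5, 6};
      const float B[6] = {1, 0, 0, 1, 1, 1};
      float C[4] = {0, 0, 0, 0};
      cblas_sgemm(CblasRowMajor, CblasNoTrans, CblasNoTrans,
                  /*M=*/2, /*N=*/2, /*K=*/3,
                  1.0f, A, /*lda=*/3, B, /*ldb=*/2, 0.0f, C, /*ldc=*/2);
    }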

-sz

On 2019/04/07 18:56:43, Aaron Markham  wrote: 
> +1
> Reduced complexity. Choice of math library... Hopefully you can just
> install MKL and not be forced into mshadow's dependency on OpenBLAS. This
> could make Windows setup easier.
> Maybe this issue will get fixed: #11769.
> 
> On Sun, Apr 7, 2019, 00:51 Junru Shao  wrote:
> 
> > Does merging mshadow into mxnet bring any actual benefit for customers in
> > terms of performance, portability, or anything else?
> >
> > On Fri, Apr 5, 2019 at 9:38 PM Tianqi Chen 
> > wrote:
> >
> > > Technically, mshadow is sufficient for MXNet. Adopting other libraries (
> > > eigen or xtensor) will unnecessarily increase the codebase complexity
> > > without any additional gains.
> > >
> > > Given that mshadow is only used by MXNet, I do support donating it into
> > > the mxnet codebase.
> > > To respect the original mshadow community, I would recommend starting a
> > > community RFC in the mshadow GitHub issue for a week, before we start the
> > > migration process.
> > > Also, I would recommend a rebase merge just like the case of MXNet.jl
> > code
> > > base to preserve the contribution history.
> > >
> > > Tianqi
> > >
> > >
> > > On Fri, Apr 5, 2019 at 9:25 PM Alfredo Luque
> > >  wrote:
> > >
> > > > Do you have a link to both of these proposals?
> > > >
> > > > On Fri, Apr 5, 2019 at 20:14 Anirudh Acharya 
> > > > wrote:
> > > >
> > > > > Hi Pedro,
> > > > >
> > > > > mshadow is mostly used for tensor arithmetic. There have been
> > > discussions
> > > > > about including it within mxnet. I think it is a good idea.
> > > > >
> > > > > As a more long term solution using libraries like eigen to perform
> > > linear
> > > > > algebra operations was also suggested by anirudh2290@. I think
> > > xtensor(
> > > > > https://github.com/QuantStack/xtensor ) can also be a candidate
> > here.
> > > > >
> > > > > -
> > > > > Anirudh
> > > > >
> > > > >
> > > > > On Fri, Apr 5, 2019 at 7:03 PM Pedro Larroy <
> > > > pedro.larroy.li...@gmail.com>
> > > > > wrote:
> > > > >
> > > > > > Hi
> > > > > >
> > > > > > Some developers have noticed that working in mshadow is cumbersome
> > as
> > > > > > it's a 3rdparty subrepo.
> > > > > >
> > > > > > Since mshadow is a bunch of headers which don't have much in the
> > > > > > way of independent tests / library functionality, other developers
> > > > > > and I believe that it would be good to assimilate this code in the
> > > > > > repository for ease of contribution and changes without having to
> > > > > > go through contortions to test PRs that modify mshadow.
> > > > > >
> > > > > > Would anybody oppose this change?
> > > > > >
> > > > > > Thanks and have a nice weekend.
> > > > > >
> > > > > > Pedro.
> > > > > >
> > > > >
> > > >
> > >
> >
> 


Re: assimilation of mshadow into the MXNet codebase

2019-04-07 Thread Sheng Zha
I agree it would make development easier to donate mshadow to mxnet code base, 
since mshadow is only used in MXNet. I support donating the mshadow code to 
mxnet and I started an RFC for this in mshadow [1].

[1] https://github.com/dmlc/mshadow/issues/373

-sz

On 2019/04/06 04:38:19, Tianqi Chen  wrote: 
> Technically, mshadow is sufficient for MXNet. Adopting other libraries (
> eigen or xtensor) will unnecessarily increase the codebase complexity
> without any additional gains.
> 
> Given that mshadow is only used by MXNet, I do support donating it into
> the mxnet codebase.
> To respect the original mshadow community, I would recommend starting a
> community RFC in the mshadow GitHub issue for a week, before we start the
> migration process.
> Also, I would recommend a rebase merge just like the case of MXNet.jl code
> base to preserve the contribution history.
> 
> Tianqi
> 
> 
> On Fri, Apr 5, 2019 at 9:25 PM Alfredo Luque
>  wrote:
> 
> > Do you have a link to both of these proposals?
> >
> > On Fri, Apr 5, 2019 at 20:14 Anirudh Acharya 
> > wrote:
> >
> > > Hi Pedro,
> > >
> > > mshadow is mostly used for tensor arithmetic. There have been discussions
> > > about including it within mxnet. I think it is a good idea.
> > >
> > > As a more long term solution using libraries like eigen to perform linear
> > > algebra operations was also suggested by anirudh2290@. I think xtensor(
> > > https://github.com/QuantStack/xtensor ) can also be a candidate here.
> > >
> > > -
> > > Anirudh
> > >
> > >
> > > On Fri, Apr 5, 2019 at 7:03 PM Pedro Larroy <
> > pedro.larroy.li...@gmail.com>
> > > wrote:
> > >
> > > > Hi
> > > >
> > > > Some developers have noticed that working in mshadow is cumbersome as
> > > > it's a 3rdparty subrepo.
> > > >
> > > > Since mshadow is a bunch of headers which don't have much in the way
> > > > of independent tests / library functionality, other developers and I
> > > > believe that it would be good to assimilate this code in the
> > > > repository for ease of contribution and changes without having to go
> > > > through contortions to test PRs that modify mshadow.
> > > >
> > > > Would anybody oppose this change?
> > > >
> > > > Thanks and have a nice weekend.
> > > >
> > > > Pedro.
> > > >
> > >
> >
> 


Re: [MXNET 2.0 Wishlist] [DISCUSS] Backend choices during runtime

2019-04-07 Thread Tianqi Chen
While I personally like the idea, this can be something that is fairly
technically challenging, and I would caution against it versus pushing for
good features and simply allowing runtime configuration.

The main problem here is due to the C++ ABI. There is no standard C++ ABI
across compilers, which means resorting to runtime DLLs and dynamic loading
brings all sorts of technical problems, especially when multiple modules
depend on the same third-party dependency (the CUDA runtime).
There is no good go-to solution here, especially given the explosion of
backend variants and dependencies in C++.
A partial solution could be achieved through the sole use of C ABI.
Combining this with code generation can result in some simplifications and
enable some runtime-loadable modules. TVM does this, and perhaps MXNet could
reuse some of that component for operator libraries. Similarly, having a
customizable operator library that is loadable via C ABI might be possible.
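
To make the C ABI route concrete, a loadable operator module could look
roughly like the sketch below -- hypothetical names, not an existing MXNet
interface:

    // plugin.cc -- built as libmyop.so. Only C types cross the boundary,
    // so the host and the plugin may use different C++ compilers.
    #include <cstdint>

    extern "C" int MyOpForward(const float* in, float* out,
                               const int64_t* shape, int ndim) {
      int64_t size = 1;
      for (int i = 0; i < ndim; ++i) size *= shape[i];
      for (int64_t i = 0; i < size; ++i)
        out[i] = in[i] > 0 ? in[i] : 0;  // a trivial ReLU as the payload
      return 0;  // 0 == success
    }

    // Host side (sketch): resolve the symbol with no C++ ABI assumptions.
    //   void* h = dlopen("./libmyop.so", RTLD_NOW);
    //   auto fwd = reinterpret_cast<int (*)(const float*, float*,
    //                                       const int64_t*, int)>(
    //       dlsym(h, "MyOpForward"));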

So to summarize, while I really like the idea of dynamically loadable
modules, my past experience suggests that this will bring a lot of
additional engineering burden and technical debt without significant
benefit. I would suggest starting by supporting something simple like a
plugin module, before moving toward the general direction.

Tianqi

On Sun, Apr 7, 2019 at 1:31 PM kellen sunderland <
kellen.sunderl...@gmail.com> wrote:

> Strongly support the idea of runtime loadable components in MXNet.  There's
> no reason (other than perhaps engineering effort) we can't have a single
> compilation of MXNet that finds dependencies and chooses execution paths
> intelligently (or based on configuration) at runtime.
>
> On Thu, Apr 4, 2019 at 12:29 PM Marco de Abreu 
> wrote:
>
> > Hello,
> >
> > I'd like to start a discussion about something that I've noticed being
> > troublesome to maintain in the current version: Backend choices being
> made
> > at compile time.
> >
> > Right now, the different backends and accelerators (CPU, cuda, mkl, AWS
> > elastic inference, (future) AMD, openblas, TVM, etc) are all scattered
> > across the different layers of MXNet. On one hand, we have compile time
> > flags that decide which backends are being compiled into the binary,
> while
> > at the same time choices can be made in the frontend during runtime.
> >
> > At the moment, we have a lot of conditional build logic that picks
> > different parts. With the addition of MKLML and later MKLDNN the clear
> > separation of CPU and GPU got kind of broken up. While we have some places
> > where each backend's code lives, in the end we resort to some files
> > containing a lot of conditional logic for the different backends (sorry I
> > can't provide links right now since I'm on mobile). To me this seems like a
> > residue of the fast development style from the early days (more
> > preprocessor statements and less object orientation) while also having
> > organic growth with new
> > accelerators. When I see how much AMD had to hack to fit in their
> > implementation, it seemed like we have to make this part more developer
> > friendly.
> >
> > At the moment, every new flavour of MXNet has to be entirely recompiled.
> > This makes it hard for users to figure out which options to use, while it
> > makes it harder for us to test since the overhead to test every single
> > combination of compile parameters would be overwhelming.
> >
> > I'd propose to have a clear class hierarchy based structure for
> > accelerators, operators and memory management. This structure can then be
> > implemented by the different backends. To reduce the compile burden, we
> > would introduce dynamic loading and split the different backends into
> > modules. These could then be developed, maintained and compiled on their
> > own and then placed in a "module" folder to be loaded at runtime. Adding
> a
> > new accelerator would be a matter of placing the precompiled binary into
> > the folder. The detailed configuration of that Backend would then be done
> > at runtime - the user shouldn't worry at the point of downloading mxnet
> > whether they want mkl, MKLDNN, openblas, atlas, TVM, cuda or whatever
> > else there is. I have an idea how we could help the user choosing, but
> > that's outside the scope of this proposal.
> >
> > This would allow us to have a "core" MXNet that takes care of the engine,
> > scheduling, communication and all the other crucial parts. On the other
> > hand we could make MXNet less of a monolith and have clear interfaces.
> This
> > would also act as a forcing function because the different parts wouldn't
> > be intermingled but have to follow the common interface.
> >
> > Of course this comes with the question what these interfaces would look
> > like. For operators, I'd like to propose getting inspiration from (or
> > fully adopting) ONNX. For memory management and other Backend-specific
> > things we
> > could look at the current implementations and find a common ground.
> >
> > Back when I had a community driven project, we heavily used this
> > modularity and it brought great benefits.

Re: [MXNET 2.0 Wishlist] [DISCUSS] Backend choices during runtime

2019-04-07 Thread kellen sunderland
Strongly support the idea of runtime loadable components in MXNet.  There's
no reason (other than perhaps engineering effort) we can't have a single
compilation of MXNet that finds dependencies and chooses execution paths
intelligently (or based on configuration) at runtime.

On Thu, Apr 4, 2019 at 12:29 PM Marco de Abreu 
wrote:

> Hello,
>
> I'd like to start a discussion about something that I've noticed being
> troublesome to maintain in the current version: Backend choices being made
> at compile time.
>
> Right now, the different backends and accelerators (CPU, cuda, mkl, AWS
> elastic inference, (future) AMD, openblas, TVM, etc) are all scattered
> across the different layers of MXNet. On one hand, we have compile time
> flags that decide which backends are being compiled into the binary, while
> at the same time choices can be made in the frontend during runtime.
>
> At the moment, we have a lot of conditional build logic that picks
> different parts. With the addition of MKLML and later MKLDNN the clear
> separation of CPU and GPU got kind of broken up. While we have some places
> where each backend's code lives, in the end we resort to some files
> containing a lot of conditional logic for the different backends (sorry I
> can't provide links right now since I'm on mobile). To me this seems like a
> residue of the fast development style from the early days (more preprocessor
> statements and less object orientation) while also having organic growth
> with new
> accelerators. When I see how much AMD had to hack to fit in their
> implementation, it seemed like we have to make this part more developer
> friendly.
>
> At the moment, every new flavour of MXNet has to be entirely recompiled.
> This makes it hard for users to figure out which options to use, while it
> makes it harder for us to test since the overhead to test every single
> combination of compile parameters would be overwhelming.
>
> I'd propose to have a clear class hierarchy based structure for
> accelerators, operators and memory management. This structure can then be
> implemented by the different backends. To reduce the compile burden, we
> would introduce dynamic loading and split the different backends into
> modules. These could then be developed, maintained and compiled on their
> own and then placed in a "module" folder to be loaded at runtime. Adding a
> new accelerator would be a matter of placing the precompiled binary into
> the folder. The detailed configuration of that Backend would then be done
> at runtime - the user shouldn't worry at the point of downloading mxnet
> whether they want mkl, MKLDNN, openblas, atlas, TVM, cuda or whatever
> else there is. I have an idea how we could help the user choosing, but
> that's outside the scope of this proposal.
>
> This would allow us to have a "core" MXNet that takes care of the engine,
> scheduling, communication and all the other crucial parts. On the other
> hand we could make MXNet less of a monolith and have clear interfaces. This
> would also act as a forcing function because the different parts wouldn't
> be intermingled but have to follow the common interface.
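
To illustrate, the kind of interface hierarchy being described might look
roughly like this -- purely illustrative, not existing MXNet code:

    #include <cstddef>
    #include <string>

    // A minimal backend interface that CPU/CUDA/MKL-DNN modules could
    // implement; the core would program only against this abstraction.
    class Backend {
     public:
      virtual ~Backend() = default;
      virtual std::string Name() const = 0;
      virtual void* Alloc(std::size_t bytes) = 0;  // memory management
      virtual void Free(void* ptr) = 0;
      virtual bool SupportsOp(const std::string& op) const = 0;
    };

    // Each module (e.g. a libmxnet_backend_cuda.so dropped into the
    // "module" folder) would export a flat C factory so the core can
    // instantiate it after dlopen. Note that returning a C++ interface
    // still ties plugins to a compatible compiler, the caveat Tianqi
    // raises elsewhere in this thread.
    extern "C" Backend* CreateBackend();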
>
> Of course this comes with the question what these interfaces would look
> like. For operators, I'd like to propose getting inspiration from (or fully
> adopting) ONNX. For memory management and other Backend-specific things we
> could look at the current implementations and find a common ground.
>
> Back when I had a community driven project, we heavily used this modularity
> and it brought great benefits - besides the fact that our core was closed
> source. It allowed community developers to act entirely independent from
> other parts and even allowed them to add their own logic without having to
> touch the core. Thinking about companies that implement their own backends
> or have special tweaked operators without wanting to disclose them, this
> structure would avoid them having to fork the project and then spend a lot
> of effort porting the changes to the latest source release versions.
> Instead, they would maintain their module and we as MXNet community would
> only have to maintain these interfaces.
>
> Right now this is a lot of prose and basically a brain dump of my thoughts.
> I'd be happy to follow up with details, but first I'd be curious what the
> community thinks about this design.
>
> Best regards,
> Marco
>


Re: assimilation of mshadow into the MXNet codebase

2019-04-07 Thread kellen sunderland
"Does merging mshadow into mxnet bring any actual benefit for customers in
terms of performance, portability, or anything else?"

It would improve the contributor experience in that if we find a bug which
requires fixes in both repos, we won't have to coordinate 2 PRs.  It would
also make compilation more straightforward (as others have mentioned).

On Sun, Apr 7, 2019 at 11:56 AM Aaron Markham 
wrote:

> +1
> Reduced complexity. Choice of math library... Hopefully you can just
> install MKL and not be forced into mshadow's dependency on OpenBLAS. This
> could make Windows setup easier.
> Maybe this issue will get fixed: #11769.
>
> On Sun, Apr 7, 2019, 00:51 Junru Shao  wrote:
>
> > Does merging mshadow into mxnet bring any actual benefit for customers in
> > terms of performance, portability, or anything else?
> >
> > On Fri, Apr 5, 2019 at 9:38 PM Tianqi Chen 
> > wrote:
> >
> > > Technically, mshadow is sufficient for MXNet. Adopting other libraries
> (
> > > eigen or xtensor) will unnecessarily increase the codebase complexity
> > > without any additional gains.
> > >
> > > Given that mshadow is only used by MXNet, I do support donating it into
> > > the mxnet codebase.
> > > To respect the original mshadow community, I would recommend starting a
> > > community RFC in the mshadow GitHub issue for a week, before we start
> > > the migration process.
> > > Also, I would recommend a rebase merge just like the case of MXNet.jl
> > code
> > > base to preserve the contribution history.
> > >
> > > Tianqi
> > >
> > >
> > > On Fri, Apr 5, 2019 at 9:25 PM Alfredo Luque
> > >  wrote:
> > >
> > > > Do you have a link to both of these proposals?
> > > >
> > > > On Fri, Apr 5, 2019 at 20:14 Anirudh Acharya 
> > > > wrote:
> > > >
> > > > > Hi Pedro,
> > > > >
> > > > > mshadow is mostly used for tensor arithmetic. There have been
> > > discussions
> > > > > about including it within mxnet. I think it is a good idea.
> > > > >
> > > > > As a more long term solution using libraries like eigen to perform
> > > linear
> > > > > algebra operations was also suggested by anirudh2290@. I think
> > > xtensor(
> > > > > https://github.com/QuantStack/xtensor ) can also be a candidate
> > here.
> > > > >
> > > > > -
> > > > > Anirudh
> > > > >
> > > > >
> > > > > On Fri, Apr 5, 2019 at 7:03 PM Pedro Larroy <
> > > > pedro.larroy.li...@gmail.com>
> > > > > wrote:
> > > > >
> > > > > > Hi
> > > > > >
> > > > > > Some developers have noticed that working in mshadow is
> cumbersome
> > as
> > > > > > it's a 3rdparty subrepo.
> > > > > >
> > > > > > Since mshadow is a bunch of headers which don't have much in the
> > > > > > way of independent tests / library functionality, other developers
> > > > > > and I believe that it would be good to assimilate this code in the
> > > > > > repository for ease of contribution and changes without having to
> > > > > > go through contortions to test PRs that modify mshadow.
> > > > > >
> > > > > > Would anybody oppose this change?
> > > > > >
> > > > > > Thanks and have a nice weekend.
> > > > > >
> > > > > > Pedro.
> > > > > >
> > > > >
> > > >
> > >
> >
>


Re: assimilation of mshadow into the MXNet codebase

2019-04-07 Thread Aaron Markham
+1
Reduced complexity. Choice of math library... Hopefully you can just
install MKL and not be forced into mshadow's dependency on OpenBLAS. This
could make Windows setup easier.
Maybe this issue will get fixed: #11769.

On Sun, Apr 7, 2019, 00:51 Junru Shao  wrote:

> Does merging mshadow into mxnet bring any actual benefit for customers in
> terms of performance, portability, or anything else?
>
> On Fri, Apr 5, 2019 at 9:38 PM Tianqi Chen 
> wrote:
>
> > Technically, mshadow is sufficient for MXNet. Adopting other libraries (
> > eigen or xtensor) will unnecessarily increase the codebase complexity
> > without any additional gains.
> >
> > Given that mshadow is only used by MXNet, I do support donating it into
> > the mxnet codebase.
> > To respect the original mshadow community, I would recommend starting a
> > community RFC in the mshadow GitHub issue for a week, before we start the
> > migration process.
> > Also, I would recommend a rebase merge just like the case of MXNet.jl
> code
> > base to preserve the contribution history.
> >
> > Tianqi
> >
> >
> > On Fri, Apr 5, 2019 at 9:25 PM Alfredo Luque
> >  wrote:
> >
> > > Do you have a link to both of these proposals?
> > >
> > > On Fri, Apr 5, 2019 at 20:14 Anirudh Acharya 
> > > wrote:
> > >
> > > > Hi Pedro,
> > > >
> > > > mshadow is mostly used for tensor arithmetic. There have been
> > discussions
> > > > about including it within mxnet. I think it is a good idea.
> > > >
> > > > As a more long term solution using libraries like eigen to perform
> > linear
> > > > algebra operations was also suggested by anirudh2290@. I think
> > xtensor(
> > > > https://github.com/QuantStack/xtensor ) can also be a candidate
> here.
> > > >
> > > > -
> > > > Anirudh
> > > >
> > > >
> > > > On Fri, Apr 5, 2019 at 7:03 PM Pedro Larroy <
> > > pedro.larroy.li...@gmail.com>
> > > > wrote:
> > > >
> > > > > Hi
> > > > >
> > > > > Some developers have noticed that working in mshadow is cumbersome
> as
> > > > > it's a 3rdparty subrepo.
> > > > >
> > > > > Since mshadow is a bunch of headers which don't have much in the
> > > > > way of independent tests / library functionality, other developers
> > > > > and I believe that it would be good to assimilate this code in the
> > > > > repository for ease of contribution and changes without having to
> > > > > go through contortions to test PRs that modify mshadow.
> > > > >
> > > > > Would anybody oppose this change?
> > > > >
> > > > > Thanks and have a nice weekend.
> > > > >
> > > > > Pedro.
> > > > >
> > > >
> > >
> >
>


Re: assimilation of mshadow into the MXNet codebase

2019-04-07 Thread Junru Shao
Does merging mshadow into mxnet bring any actual benefit for customers in
terms of performance, portability, or anything else?

On Fri, Apr 5, 2019 at 9:38 PM Tianqi Chen  wrote:

> Technically, mshadow is sufficient for MXNet. Adopting other libraries (
> eigen or xtensor) will unnecessarily increase the codebase complexity
> without any additional gains.
>
> Given that mshadow is only used by MXNet, I do support donating it into
> the mxnet codebase.
> To respect the original mshadow community, I would recommend starting a
> community RFC in the mshadow GitHub issue for a week, before we start the
> migration process.
> Also, I would recommend a rebase merge just like the case of MXNet.jl code
> base to preserve the contribution history.
>
> Tianqi
>
>
> On Fri, Apr 5, 2019 at 9:25 PM Alfredo Luque
>  wrote:
>
> > Do you have a link to both of these proposals?
> >
> > On Fri, Apr 5, 2019 at 20:14 Anirudh Acharya 
> > wrote:
> >
> > > Hi Pedro,
> > >
> > > mshadow is mostly used for tensor arithmetic. There have been
> discussions
> > > about including it within mxnet. I think it is a good idea.
> > >
> > > As a more long term solution using libraries like eigen to perform
> linear
> > > algebra operations was also suggested by anirudh2290@. I think
> xtensor(
> > > https://github.com/QuantStack/xtensor ) can also be a candidate here.
> > >
> > > -
> > > Anirudh
> > >
> > >
> > > On Fri, Apr 5, 2019 at 7:03 PM Pedro Larroy <
> > pedro.larroy.li...@gmail.com>
> > > wrote:
> > >
> > > > Hi
> > > >
> > > > Some developers have noticed that working in mshadow is cumbersome as
> > > > it's a 3rdparty subrepo.
> > > >
> > > > Since mshadow is a bunch of headers which don't have much in the way
> > > > of independent tests / library functionality, other developers and I
> > > > believe that it would be good to assimilate this code in the
> > > > repository for ease of contribution and changes without having to go
> > > > through contortions to test PRs that modify mshadow.
> > > >
> > > > Would anybody oppose this change?
> > > >
> > > > Thanks and have a nice weekend.
> > > >
> > > > Pedro.
> > > >
> > >
> >
>


Re: assimilation of mshadow into the MXNet codebase

2019-04-05 Thread Tianqi Chen
Technically, mshadow is sufficient for MXNet. Adopting other libraries (
eigen or xtensor) will unnecessarily increase the codebase complexity
without any additional gains.

Given that mshadow is only used by MXNet, I do support donating it into
the mxnet codebase.
To respect the original mshadow community, I would recommend starting a
community RFC in the mshadow GitHub issue for a week, before we start the
migration process.
Also, I would recommend a rebase merge just like the case of MXNet.jl code
base to preserve the contribution history.

Tianqi


On Fri, Apr 5, 2019 at 9:25 PM Alfredo Luque
 wrote:

> Do you have a link to both of these proposals?
>
> On Fri, Apr 5, 2019 at 20:14 Anirudh Acharya 
> wrote:
>
> > Hi Pedro,
> >
> > mshadow is mostly used for tensor arithmetic. There have been discussions
> > about including it within mxnet. I think it is a good idea.
> >
> > As a more long term solution using libraries like eigen to perform linear
> > algebra operations was also suggested by anirudh2290@. I think xtensor(
> > https://github.com/QuantStack/xtensor ) can also be a candidate here.
> >
> > -
> > Anirudh
> >
> >
> > On Fri, Apr 5, 2019 at 7:03 PM Pedro Larroy <
> pedro.larroy.li...@gmail.com>
> > wrote:
> >
> > > Hi
> > >
> > > Some developers have noticed that working in mshadow is cumbersome as
> > > it's a 3rdparty subrepo.
> > >
> > > Since mshadow is a bunch of headers which don't have much in the way
> > > of independent tests / library functionality, other developers and I
> > > believe that it would be good to assimilate this code in the
> > > repository for ease of contribution and changes without having to go
> > > through contortions to test PRs that modify mshadow.
> > >
> > > Would anybody oppose this change?
> > >
> > > Thanks and have a nice weekend.
> > >
> > > Pedro.
> > >
> >
>


Re: assimilation of mshadow into the MXNet codebase

2019-04-05 Thread Alfredo Luque
Do you have a link to both of these proposals?

On Fri, Apr 5, 2019 at 20:14 Anirudh Acharya  wrote:

> Hi Pedro,
>
> mshadow is mostly used for tensor arithmetic. There have been discussions
> about including it within mxnet. I think it is a good idea.
>
> As a more long term solution using libraries like eigen to perform linear
> algebra operations was also suggested by anirudh2290@. I think xtensor(
> https://github.com/QuantStack/xtensor ) can also be a candidate here.
>
> -
> Anirudh
>
>
> On Fri, Apr 5, 2019 at 7:03 PM Pedro Larroy 
> wrote:
>
> > Hi
> >
> > Some developers have noticed that working in mshadow is cumbersome as
> > it's a 3rdparty subrepo.
> >
> > Since mshadow is a bunch of headers which don't have much in the way of
> > independent tests / library functionality, other developers and I
> > believe that it would be good to assimilate this code into the
> > repository for ease of contribution and changes without having to go
> > through contortions to test PRs that modify mshadow.
> >
> > Would anybody oppose this change?
> >
> > Thanks and have a nice weekend.
> >
> > Pedro.
> >
>


Re: assimilation of mshadow into the MXNet codebase

2019-04-05 Thread Anirudh Acharya
Hi Pedro,

mshadow is mostly used for tensor arithmetic. There have been discussions
about including it within mxnet. I think it is a good idea.

As a more long-term solution, using libraries like eigen to perform linear
algebra operations was also suggested by anirudh2290@. I think xtensor
(https://github.com/QuantStack/xtensor) can also be a candidate here.
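
To make the xtensor suggestion concrete, the same style of lazily evaluated
tensor arithmetic looks roughly like this (a minimal sketch assuming
xtensor's public headers; the arrays are illustrative):

// xtensor_sketch.cc - like mshadow, xtensor builds lazy expression
// templates that are only evaluated when assigned to a container.
#include <iostream>
#include <xtensor/xarray.hpp>
#include <xtensor/xmath.hpp>
#include <xtensor/xio.hpp>

int main() {
  xt::xarray<double> a = {{1.0, 2.0}, {3.0, 4.0}};
  xt::xarray<double> b = {{0.1, 0.2}, {0.3, 0.4}};
  // `expr` is an unevaluated expression object, not a materialized array.
  auto expr = a + 2.0 * xt::sin(b);
  // Assigning to an xarray forces element-wise evaluation.
  xt::xarray<double> result = expr;
  std::cout << result << std::endl;
  return 0;
}

So a migration in this direction would keep the expression-template style
that mshadow users are already used to.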

-
Anirudh


On Fri, Apr 5, 2019 at 7:03 PM Pedro Larroy 
wrote:

> Hi
>
> Some developers have noticed that working in mshadow is cumbersome as
> it's a 3rdparty subrepo.
>
> Since mshadow is a bunch of headers which don't have much in the way of
> independent tests / library functionality, other developers and I
> believe that it would be good to assimilate this code into the
> repository for ease of contribution and changes without having to go
> through contortions to test PRs that modify mshadow.
>
> Would anybody oppose this change?
>
> Thanks and have a nice weekend.
>
> Pedro.
>


Re: [MXNET 2.0 Wishlist] [DISCUSS] Single build system

2019-04-05 Thread Junru Shao
I agree with Tianqi and Marco. Probably what should happen is to let cmake
be the default in some minor release, and completely deprecate makefiles in
2.0.

On Fri, Apr 5, 2019 at 10:23 AM Marco de Abreu 
wrote:

> I think this is rather about the deprecation of the make-based build
> system. We currently have make and cmake in parallel but with diverging
> feature support.
>
> -Marco
>
> Tianqi Chen  wrote on Fri., 5 Apr. 2019, 11:42:
>
> > I am in favor of using CMake. And I personally think CMake is not
> something
> > that has to be introduced in a 2.0. It can simply be part of a minor
> > release.
> >
> > Tianqi
> >
> > On Thu, Apr 4, 2019 at 10:31 AM Kellen Sunderland 
> > wrote:
> >
> > > Hello MXNet devs,
> > >
> > > I'd like to start a thread discussing what our build system should look
> > > like in MXNet 2.0.  I'd propose that although the current make system
> > > has served us well in the past, we remove it along with the bump to 2.0.
> > > The end goal I'd like to see is that we have a clean build system,
> > > without a bunch of conditional logic that makes contributing and testing
> > > MXNet a simpler process.  Additionally I'd propose we target a minimum
> > > cmake version of 3.7 for reasons described below.
> > >
> > > First I'd like to give some context on why I'd propose we don't just
> > > switch to cmake, but we also target a relatively new version (version
> > > 3.7 from Nov, 2016) of cmake.  The largest benefits in making this
> > > change would apply to CUDA builds where cmake itself has quite
> > > inconsistent functionality between versions.  One persistent annoyance
> > > I've had with cmake is that we've had conditional logic for the FindCUDA
> > > command which at one point targeted some modern cmake features, but then
> > > in subsequent versions of cmake the way these features work was tweaked,
> > > and now I find these cmake features are consistently broken to the point
> > > that I require a bunch of -D defines to compile properly or to use an
> > > IDE.  An additional CUDA related issue is that every time there's a new
> > > SM added to NVCC we have to make a few source changes to support it.  I
> > > could see this being problematic for users who may suddenly realize that
> > > due to their compilation settings, they may not actually be enabling the
> > > features they think they are with their shiny new GPUs.
> > >
> > > As an alternative if we, for example, target cmake 3.7 at a minimum, and
> > > we want to find cuda and then build a list of reasonable PTX/BINS we
> > > could use the following command[1]:
> > >
> > > 
> > > FindCUDA(...)
> > > ...
> > > CUDA_SELECT_NVCC_ARCH_FLAGS(ARCH_FLAGS 3.0 3.5+PTX 5.2(5.0) Maxwell)
> > >   LIST(APPEND CUDA_NVCC_FLAGS ${ARCH_FLAGS})
> > > 
> > >
> > > Simple, concise, and it would help to make the building experience more
> > > consistent across platforms, build environments and IDEs (looking at you
> > > CLion).  We'd of course need to do a little experimentation work to make
> > > sure that this does indeed work as intended, and can replace the
> > > currently complex findCuda logic we have in our build systems, but for
> > > the sake of the proposal let's assume these cmake commands do indeed
> > > work consistently as documented from cmake 3.7 onwards.
> > >
> > > To give users a chance to update their tooling I'd also suggest we begin
> > > warning users at least a release in advance that make based builds will
> > > be deprecated in MXNet 2.0 so they can begin migrating to cmake.  I'd
> > > also want to display deprecation messages for unused cmake flags (such
> > > as the profiler flag) for a release before the 2.0 release, and then
> > > remove them in 2.0.
> > >
> > > Of course not all users have cmake 3.7 on their systems, some of our
> > > employers force us to use ridiculously outdated linux distributions.
> > > The good news for these users is that if we can offer Docker compilation
> > > with an image that has a supported version of cmake, we should be able
> > > to build a portable binary that works even with very old distributions
> > > of Linux.  Additionally installing cmake from source is also fairly
> > > straightforward [2] and works quite well on older distros in my
> > > experience.
> > >
> > > Looking forward to hearing what others think.  Any preferred build
> > > systems that you all would want to use?  Is cmake the right system to
> > > centralize on?  If so, is version 3.7 a reasonable minimum version to
> > > target?  Is the 2.0 release a good point at which we can think about
> > > simplifying build logic?
> > >
> > > 1: https://cmake.org/cmake/help/v3.7/module/FindCUDA.html
> > > 2: https://github.com/Kitware/CMake
> > >
> >
>


Re: [MXNET 2.0 Wishlist] [DISCUSS] Single build system

2019-04-05 Thread Marco de Abreu
I think this is rather about the deprecation of the make-based build
system. We currently have make and cmake in parallel but with diverging
feature support.

-Marco

Tianqi Chen  wrote on Fri., 5 Apr. 2019, 11:42:

> I am in favor of using CMake. And I personally think CMake is not something
> that has to be introduced in a 2.0. It can simply be part of a minor
> release.
>
> Tianqi
>
> On Thu, Apr 4, 2019 at 10:31 AM Kellen Sunderland 
> wrote:
>
> > Hello MXNet devs,
> >
> > I'd like to start a thread discussing what our build system should look
> > like in MXNet 2.0.  I'd propose that although the current make system has
> > served us well in the past, we remove it along with the bump to 2.0.  The
> > end goal I'd like to see is that we have a clean build system, without a
> > bunch of conditional logic that makes contributing and testing MXNet a
> > simpler process.  Additionally I'd propose we target a minimum cmake
> > version of 3.7 for reasons described below.
> >
> > First I'd like to give some context on why I'd propose we don't just switch
> > to cmake, but we also target a relatively new version (version 3.7 from
> > Nov, 2016) of cmake.  The largest benefits in making this change would
> > apply to CUDA builds where cmake itself has quite inconsistent
> > functionality between versions.  One persistent annoyance I've had with
> > cmake is that we've had conditional logic for the FindCUDA command which at
> > one point targeted some modern cmake features, but then in subsequent
> > versions of cmake the way these features work was tweaked, and now I find
> > these cmake features are consistently broken to the point that I require a
> > bunch of -D defines to compile properly or to use an IDE.  An additional
> > CUDA related issue is that every time there's a new SM added to NVCC we
> > have to make a few source changes to support it.  I could see this being
> > problematic for users who may suddenly realize that due to their
> > compilation settings, they may not actually be enabling the features they
> > think they are with their shiny new GPUs.
> >
> > As an alternative if we, for example, target cmake 3.7 at a minimum, and we
> > want to find cuda and then build a list of reasonable PTX/BINS we could use
> > the following command[1]:
> >
> > 
> > FindCUDA(...)
> > ...
> > CUDA_SELECT_NVCC_ARCH_FLAGS(ARCH_FLAGS 3.0 3.5+PTX 5.2(5.0) Maxwell)
> >   LIST(APPEND CUDA_NVCC_FLAGS ${ARCH_FLAGS})
> > 
> >
> > Simple, concise, and it would help to make the building experience more
> > consistent across platforms, build environments and IDEs (looking at you
> > CLion).  We'd of course need to do a little experimentation work to make
> > sure that this does indeed work as intended, and can replace the currently
> > complex findCuda logic we have in our build systems, but for the sake of
> > the proposal let's assume these cmake commands do indeed work consistently
> > as documented from cmake 3.7 onwards.
> >
> > To give users a chance to update their tooling I'd also suggest we begin
> > warning users at least a release in advance that make based builds will be
> > deprecated in MXNet 2.0 so they can begin migrating to cmake.  I'd also
> > want to display deprecation messages for unused cmake flags (such as the
> > profiler flag) for a release before the 2.0 release, and then remove them
> > in 2.0.
> >
> > Of course not all users have cmake 3.7 on their systems, some of our
> > employers force us to use ridiculously outdated linux distributions.  The
> > good news for these users is that if we can offer Docker compilation with
> > an image that has a supported version of cmake, we should be able to
> > build a portable binary that works even with very old distributions of
> > Linux.  Additionally installing cmake from source is also fairly
> > straightforward [2] and works quite well on older distros in my
> > experience.
> >
> > Looking forward to hearing what others think.  Any preferred build systems
> > that you all would want to use?  Is cmake the right system to centralize
> > on?  If so, is version 3.7 a reasonable minimum version to target?  Is the
> > 2.0 release a good point at which we can think about simplifying build
> > logic?
> >
> > 1: https://cmake.org/cmake/help/v3.7/module/FindCUDA.html
> > 2: https://github.com/Kitware/CMake
> >
>


Re: [MXNET 2.0 Wishlist] [DISCUSS] Single build system

2019-04-05 Thread Tianqi Chen
I am in favor of using CMake. And I personally think CMake is not something
that has to be introduced in 2.0. It can simply be part of a minor
release.

Tianqi

On Thu, Apr 4, 2019 at 10:31 AM Kellen Sunderland  wrote:

> Hello MXNet devs,
>
> I'd like to start a thread discussing what our build system should look
> like in MXNet 2.0.  I'd propose that although the current make system has
> served us well in the past, we remove it along with the bump to 2.0.  The
> end goal I'd like to see is that we have a clean build system, without a
> bunch of conditional logic that makes contributing and testing MXNet a
> simpler process.  Additionally I'd propose we target a minimum cmake
> version of 3.7 for reasons described below.
>
> First I'd like to give some context on why I'd propose we don't just switch
> to cmake, but we also target a relatively new version (version 3.7 from
> Nov, 2016) of cmake.  The largest benefits in making this change would
> apply to CUDA builds where cmake itself has quite inconsistent
> functionality between versions.  One persistent annoyance I've had with
> cmake is that we've had conditional logic for the FindCUDA command which at
> one point targeted some modern cmake features, but then in subsequent
> versions of cmake the way these features work was tweaked, and now I find
> these cmake features are consistently broken to the point that I require a
> bunch of -D defines to compile properly or to use an IDE.  An additional
> CUDA related issue is that every time there's a new SM added to NVCC we
> have to make a few source changes to support it.  I could see this being
> problematic for users who may suddenly realize that due to their
> compilation settings, they may not actually be enabling the features they
> think they are with their shiny new GPUs.
>
> As an alternative if we, for example, target cmake 3.7 at a minimum, and we
> want to find cuda and then build a list of reasonable PTX/BINS we could use
> the following command[1]:
>
> 
> FindCUDA(...)
> ...
> CUDA_SELECT_NVCC_ARCH_FLAGS(ARCH_FLAGS 3.0 3.5+PTX 5.2(5.0) Maxwell)
>   LIST(APPEND CUDA_NVCC_FLAGS ${ARCH_FLAGS})
> 
>
> Simple, concise, and it would help to make the building experience more
> consistent across platforms, build environments and IDEs (looking at you
> CLion).  We'd of course need to do a little experimentation work to make
> sure that this does indeed work as intended, and can replace the currently
> complex findCuda logic we have in our build systems, but for the sake of
> the proposal let's assume these cmake commands do indeed work consistently
> as documented from cmake 3.7 onwards.
>
> To give users a chance to update their tooling I'd also suggest we begin
> warning users at least a release in advance that make based builds will be
> deprecated in MXNet 2.0 so they can begin migrating to cmake.  I'd also
> want to display deprecation messages for unused cmake flags (such as the
> profiler flag) for a release before the 2.0 release, and then remove them
> in 2.0.
>
> Of course not all users have cmake 3.7 on their systems, some of our
> employers force us to use ridiculously outdated linux distributions.  The
> good news for these users is that if we can offer Docker compilation with
> an image that has a supported version of cmake, we should be able to
> build a portable binary that works even with very old distributions of
> Linux.  Additionally installing cmake from source is also fairly
> straightforward [2] and works quite well on older distros in my experience.
>
> Looking forward to hearing what others think.  Any preferred build systems
> that you all would want to use?  Is cmake the right system to centralize
> on?  If so, is version 3.7 a reasonable minimum version to target?  Is the
> 2.0 release a good point at which we can think about simplifying build
> logic?
>
> 1: https://cmake.org/cmake/help/v3.7/module/FindCUDA.html
> 2: https://github.com/Kitware/CMake
>
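
Kellen's point that users "may not actually be enabling the features they
think they are" can at least be made visible at runtime. A minimal
host-side sketch, assuming only the standard CUDA runtime API (the build
line is illustrative), that prints each device's compute capability for
comparison against the SM list a binary was compiled with:

// check_arch.cc - print each visible GPU's compute capability so it can
// be compared against the -gencode/SM list a binary was actually built
// with.  Build (illustrative): g++ check_arch.cc -I/usr/local/cuda/include
//   -L/usr/local/cuda/lib64 -lcudart -o check_arch
#include <cuda_runtime.h>
#include <cstdio>

int main() {
  int count = 0;
  if (cudaGetDeviceCount(&count) != cudaSuccess || count == 0) {
    std::printf("No CUDA devices visible.\n");
    return 1;
  }
  for (int i = 0; i < count; ++i) {
    cudaDeviceProp prop;
    if (cudaGetDeviceProperties(&prop, i) == cudaSuccess) {
      // A device newer than every compiled SM has to fall back to (or
      // fails without) PTX JIT, which is exactly the silent mismatch
      // described above.
      std::printf("GPU %d: %s, compute capability %d.%d\n",
                  i, prop.name, prop.major, prop.minor);
    }
  }
  return 0;
}

Comparing that output against the flags CUDA_SELECT_NVCC_ARCH_FLAGS produced
makes a mismatch obvious before any kernels fail to launch.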


RE: [MXNET 2.0 Wishlist] [DISCUSS] Single build system

2019-04-05 Thread Zhao, Patric
+1 single build system.



> -Original Message-
> From: Qing Lan [mailto:lanking...@live.com]
> Sent: Friday, April 5, 2019 5:27 AM
> To: dev@mxnet.incubator.apache.org
> Subject: Re: [MXNET 2.0 Wishlist] [DISCUSS] Single build system
> 
> +1 to have a single build system
> 
> Currently the way of publishing and the way of doing CI testing are very
> different. The instructions shown on the website should match the way we
> deliver it to the users.
> Having a single build process would simplify the maintenance cost and
> reach the best performance.
> 
> Thanks,
> Qing
> 
> 
> From: Marco de Abreu 
> Sent: Thursday, April 4, 2019 15:01
> To: dev@mxnet.incubator.apache.org
> Subject: Re: [MXNET 2.0 Wishlist] [DISCUSS] Single build system
> 
> +1 towards having a single build system
> 
> I'd like to add the benefit of this approach allowing us to have the same
> build logic across all operating systems. It would be great if we could make
> x86/Unix, x86/windows, x86/mac and ARM/Unix first class citizens from the
> beginning.
> 
> -Marco
> 
> Kellen Sunderland  wrote on Thu., 4 Apr. 2019, 12:31:
> 
> > Hello MXNet devs,
> >
> > I'd like to start a thread discussing what our build system should
> > look like in MXNet 2.0.  I'd propose that although the current make
> > system has served us well in the past, we remove it along with the
> > bump to 2.0.  The end goal I'd like to see is that we have a clean
> > build system, without a bunch of conditional logic that makes
> > contributing and testing MXNet a simpler process.  Additionally I'd
> > propose we target a minimum cmake version of 3.7 for reasons described
> > below.
> >
> > First I'd like to give some context on why I'd propose we don't just
> > switch to cmake, but we also target a relatively new version (version
> > 3.7 from Nov, 2016) of cmake.  The largest benefits in making this
> > change would apply to CUDA builds where cmake itself has quite
> > inconsistent functionality between versions.  One persistent annoyance
> > I've had with cmake is that we've had conditional logic for the
> > FindCUDA command which at one point targeted some modern cmake
> > features, but then in subsequent versions of cmake the way these
> > features work was tweaked, and now I find these cmake features are
> > consistently broken to the point that I require a bunch of -D defines
> > to compile properly or to use an IDE.  An additional CUDA related
> > issue is that every time there's a new SM added to NVCC we have to
> > make a few source changes to support it.  I could see this being
> > problematic for users who may suddenly realize that due to their
> > compilation settings, they may not actually be enabling the features
> > they think they are with their shiny new GPUs.
> >
> > As an alternative if we, for example, target cmake 3.7 at a minimum,
> > and we want to find cuda and then build a list of reasonable PTX/BINS
> > we could use the following command[1]:
> >
> > 
> > FindCUDA(...)
> > ...
> > CUDA_SELECT_NVCC_ARCH_FLAGS(ARCH_FLAGS 3.0 3.5+PTX 5.2(5.0) Maxwell)
> >   LIST(APPEND CUDA_NVCC_FLAGS ${ARCH_FLAGS})
> > 
> >
> > Simple, concise, and it would help to make the building experience
> > more consistent across platforms, build environments and IDEs (looking
> > at you CLion).  We'd of course need to do a little experimentation
> > work to make sure that this does indeed work as intended, and can
> > replace the currently complex findCuda logic we have in our build
> > systems, but for the sake of the proposal let's assume these cmake
> > commands do indeed work consistently as documented from cmake 3.7
> > onwards.
> >
> > To give users a chance to update their tooling I'd also suggest we
> > begin warning users at least a release in advance that make based
> > builds will be deprecated in MXNet 2.0 so they can begin migrating to
> > cmake.  I'd also want to display deprecation messages for unused cmake
> > flags (such as the profiler flag) for a release before the 2.0
> > release, and then remove them in 2.0.
> >
> > Of course not all users have cmake 3.7 on their systems, some of our
> > employers force us to use ridiculously outdated linux distributions.
> > The good news for these users is that if we can offer Docker
> > compilation with an image that has a supported version of cmake, we
> > should be able to build a portable binary that works even with very old
> > distributions of Linux.  Additionally installing cmake from source is
> >

Re: [MXNET 2.0 Wishlist] [DISCUSS] Single build system

2019-04-04 Thread Qing Lan
+1 to have a single build system

Currently the way of publishing and the way of doing CI testing are very
different. The instructions shown on the website should match the way we
deliver it to the users.
Having a single build process would simplify the maintenance cost and reach
the best performance.

Thanks,
Qing


From: Marco de Abreu 
Sent: Thursday, April 4, 2019 15:01
To: dev@mxnet.incubator.apache.org
Subject: Re: [MXNET 2.0 Wishlist] [DISCUSS] Single build system

+1 towards having a single build system

I'd like to add the benefit of this approach allowing us to have the same
build logic across all operating systems. It would be great if we could
make x86/Unix, x86/windows, x86/mac and ARM/Unix first class citizens from
the beginning.

-Marco

Kellen Sunderland  wrote on Thu., 4 Apr. 2019, 12:31:

> Hello MXNet devs,
>
> I'd like to start a thread discussing what our build system should look
> like in MXNet 2.0.  I'd propose that although the current make system has
> served us well in the past, we remove it along with the bump to 2.0.  The
> end goal I'd like to see is that we have a clean build system, without a
> bunch of conditional logic that makes contributing and testing MXNet a
> simpler process.  Additionally I'd propose we target a minimum cmake
> version of 3.7 for reasons described below.
>
> First I'd like to give some context on why I'd propose we don't just switch
> to cmake, but we also target a relatively new version (version 3.7 from
> Nov, 2016) of cmake.  The largest benefits in making this change would
> apply to CUDA builds where cmake itself has quite inconsistent
> functionality between versions.  One persistent annoyance I've had with
> cmake is that we've had conditional logic for the FindCUDA command which at
> one point targeted some modern cmake features, but then in subsequent
> versions of cmake the way these features work was tweaked, and now I find
> these cmake features are consistently broken to the point that I require a
> bunch of -D defines to compile properly or to use an IDE.  An additional
> CUDA related issue is that every time there's a new SM added to NVCC we
> have to make a few source changes to support it.  I could see this being
> problematic for users who may suddenly realize that due to their
> compilation settings, they may not actually be enabling the features they
> think they are with their shiny new GPUs.
>
> As an alternative if we, for example, target cmake 3.7 at a minimum, and we
> want to find cuda and then build a list of reasonable PTX/BINS we could use
> the following command[1]:
>
> 
> FindCUDA(...)
> ...
> CUDA_SELECT_NVCC_ARCH_FLAGS(ARCH_FLAGS 3.0 3.5+PTX 5.2(5.0) Maxwell)
>   LIST(APPEND CUDA_NVCC_FLAGS ${ARCH_FLAGS})
> 
>
> Simple, concise, and it would help to make the building experience more
> consistent across platforms, build environments and IDEs (looking at you
> CLion).  We'd of course need to do a little experimentation work to make
> sure that this does indeed work as intended, and can replace the currently
> complex findCuda logic we have in our build systems, but for the sake of
> the proposal let's assume these cmake commands do indeed work consistently
> as documented from cmake 3.7 onwards.
>
> To give users a chance to update their tooling I'd also suggest we begin
> warning users at least a release in advance that make based builds will be
> deprecated in MXNet 2.0 so they can begin migrating to cmake.  I'd also
> want to display deprecation messages for unused cmake flags (such as the
> profiler flag) for a release before the 2.0 release, and then remove them
> in 2.0.
>
> Of course not all users have cmake 3.7 on their systems, some of our
> employers force us to use ridiculously outdated linux distributions.  The
> good news for these users is that if we can offer Docker compilation with
> an image that has a supported version of cmake, we should be able to
> build a portable binary that works even with very old distributions of
> Linux.  Additionally installing cmake from source is also fairly
> straightforward [2] and works quite well on older distros in my experience.
>
> Looking forward to hearing what others think.  Any preferred build systems
> that you all would want to use?  Is cmake the right system to centralize
> on?  If so, is version 3.7 a reasonable minimum version to target?  Is the
> 2.0 release a good point at which we can think about simplifying build
> logic?
>
> 1: https://cmake.org/cmake/help/v3.7/module/FindCUDA.html
> 2: https://github.com/Kitware/CMake
>


Re: MXNet 1.4.1 Release Proposal

2019-04-04 Thread Sheng Zha
Thanks Hagay for proposing the release, and to Junru for volunteering to drive
the release. I will help Junru as the committer for this release.

-sz

On Thu, Apr 4, 2019 at 2:18 PM Junru Shao  wrote:

> Hi Hagay,
>
> I have some experience in MXNet development, and would love to volunteer
> for driving this release.
>
> Thank you so much!
>
> Best,
> Junru
>
> On Thu, Apr 4, 2019 at 1:51 PM Hagay Lupesko  wrote:
>
> > Hello MXNet community,
> >
> > As previously discussed in [0
> > <https://lists.apache.org/thread.html/a5f444999bf428d06e691b1856392ae5ebb24a3485eaa484a73de10d@%3Cdev.mxnet.apache.org%3E>],
> > and per the feedback from Pedro, Kellen and Sheng, I'd like to propose
> > releasing MXNet 1.4.1.
> > MXNet 1.4.1 is a patch release on top of 1.4.0 (following semver[1
> > ]), that includes backwards compatible bug fixes -
> a
> > couple I am aware of are mem leaks in Scala API, Gluon RNN and NDArrays.
> >
> > I went ahead and created a draft release page on CWiki [2
> > <https://cwiki.apache.org/confluence/display/MXNET/%5BDRAFT+PROPOSAL%5D+Apache+MXNet+%28incubating%29+1.4.1+Release+Plan+and+Status>],
> > thanks to Yuxi Hu for adding a mem leak fix, and thanks to Andrew Ayres,
> > Qing Lan and Sergey Sokolov for fixing bugs in 1.4.0 - I went ahead and
> > added your fixes to the list.
> >
> > Asking the community to:
> > (1) Any bug fix or regression you identified and fixed after 1.4.0
> > release? please add it to the release proposal wiki (or msg me on Slack
> > if you don't have write access, happy to do it).
> > (2) Any comments or suggestions on the release wiki? please leave
> > comments on the wiki or reply to this email.
> > (3) I am looking for volunteers to drive the release - ideally we'll have
> > two volunteers: a non-committer and a shepherd committer that can also
> > help with the logistics that require permissions. This is a great way to
> > contribute to the community and help MXNet!
> >
> > I plan to check-in in a few days and finalize the proposal, so timely
> > response is appreciated.
> >
> > Cheers,
> > Hagay
> >
> > [0]
> > https://lists.apache.org/thread.html/a5f444999bf428d06e691b1856392ae5ebb24a3485eaa484a73de10d@%3Cdev.mxnet.apache.org%3E
> > [1] https://semver.org/
> > [2]
> > https://cwiki.apache.org/confluence/display/MXNET/%5BDRAFT+PROPOSAL%5D+Apache+MXNet+%28incubating%29+1.4.1+Release+Plan+and+Status
> >
>


Re: MXNet 1.4.1 Release Proposal

2019-04-04 Thread Junru Shao
Hi Hagay,

I have some experience in MXNet development, and would love to volunteer
for driving this release.

Thank you so much!

Best,
Junru

On Thu, Apr 4, 2019 at 1:51 PM Hagay Lupesko  wrote:

> Hello MXNet community,
>
> As previously discussed in [0
> <https://lists.apache.org/thread.html/a5f444999bf428d06e691b1856392ae5ebb24a3485eaa484a73de10d@%3Cdev.mxnet.apache.org%3E>],
> and per the feedback from Pedro, Kellen and Sheng, I'd like to propose
> releasing MXNet 1.4.1.
> MXNet 1.4.1 is a patch release on top of 1.4.0 (following semver[1]) that
> includes backwards compatible bug fixes - a
> couple I am aware of are mem leaks in Scala API, Gluon RNN and NDArrays.
>
> I went ahead and created a draft release page on CWiki [2
> <https://cwiki.apache.org/confluence/display/MXNET/%5BDRAFT+PROPOSAL%5D+Apache+MXNet+%28incubating%29+1.4.1+Release+Plan+and+Status>],
> thanks to Yuxi Hu for adding a mem leak fix, and thanks to Andrew Ayres,
> Qing Lan and Sergey Sokolov for fixing bugs in 1.4.0 - I went ahead and
> added your fixes to the list.
>
> Asking the community to:
> (1) Any bug fix or regression you identified and fixed after 1.4.0 release?
> please add it to the release proposal wiki (or msg me on Slack if you don't
> have write access, happy to do it).
> (2) Any comments or suggestions on the release wiki? please leave comments
> on the wiki or reply to this email.
> (3) I am looking for volunteers to drive the release - ideally we'll have
> two volunteers: a non-committer and a shepherd committer that can also help
> with the logistics that require permissions. This is a great way to
> contribute to the community and help MXNet!
>
> I plan to check-in in a few days and finalize the proposal, so timely
> response is appreciated.
>
> Cheers,
> Hagay
>
> [0]
>
> https://lists.apache.org/thread.html/a5f444999bf428d06e691b1856392ae5ebb24a3485eaa484a73de10d@%3Cdev.mxnet.apache.org%3E
> [1] https://semver.org/
> [2]
>
> https://cwiki.apache.org/confluence/display/MXNET/%5BDRAFT+PROPOSAL%5D+Apache+MXNet+%28incubating%29+1.4.1+Release+Plan+and+Status
>


Re: Discussing plans for next MXNet releases

2019-04-04 Thread Hagay Lupesko
Thanks Kellen, Pedro and Sheng for the feedback.

Kellen -
- Thanks for proposing 1.5 features. Kindly note them on the issue Sheng
created: https://github.com/apache/incubator-mxnet/issues/14619
- For your 2.0 proposals - can you please update them in the issue Sheng
created: https://github.com/apache/incubator-mxnet/issues/9686
Pedro -
- Thank you for volunteering for 2.0 release!
- Thanks for referencing the Issue that tracks 2.0 API updates, looks like
Sheng updated it to track the broader 2.0 features.
Sheng -
- Agreed that each release should be managed separately; my intention was
to kick-start thinking on MXNet's short-term and long-term roadmap - we can
fork at this point.
- Thanks for creating the issues for 1.5 and 2.0 - the community can start
surfacing proposals there.
- Agreed that 1.4.1 should include fixes, not features. I'll start a
separate thread on that.

As discussed - we will have separate threads for each of the releases, and
I will start with 1.4.1

Cheers,
Hagay


On Tue, Apr 2, 2019 at 6:39 PM Sheng Zha  wrote:

> Hi Hagay,
>
> Thanks for taking the initiative. The proposed scope in this thread is in
> my opinion too large to fit in a single thread, so I'd suggest that we
> start separate threads for each individual release item. To elaborate on
> the reasons based on each individual item:
> - For 1.4.1 which is in the wiki page draft, I'd suggest refraining from
> adding new features there since patch release should be about bug fixes.
> - For 1.5, there are efforts such as AMP and general improvement for fp16
> support in operators, quantization efforts, etc., that should be included.
> I may have a bit more context on this so I'm happy to help initiate the
> discussion.
> - For 2.0, I think it would be more of a roadmap discussion at this stage.
>
> I hope this makes sense. Would you mind starting a thread focusing on 1.4.1
> patch release?
>
> -sz
>
>
> On Tue, Apr 2, 2019 at 5:06 PM Hagay Lupesko  wrote:
>
> > Dear MXNet community,
> >
> > I wanted to initiate a discussion about the plan and scope for the next
> > MXNet releases.
> > I suggest we focus on three releases, and get the process going in
> > parallel:
> > (1) 1.4.1 - patch release on top of 1.4.0 to address some perf regressions
> > and memory leaks I am aware of, such as the memory leak fixed on Scala
> > [0]. I went ahead and created a draft release proposal wiki [1
> > <https://cwiki.apache.org/confluence/display/MXNET/%5BDRAFT+PROPOSAL%5D+Apache+MXNet+%28incubating%29+1.4.1+Release+Plan+and+Status>].
> > (2) 1.5.0 - a minor release to add new features introduced since 1.4.0
> > release started (back in Nov 2018!), such as various performance
> > improvements: aggregate SGD, in-place updates in optimizers, gpu support
> > for image processing operators and many more features useful for MXNet’s
> > users.
> > (3) 2.0 - an exciting major release that will include major enhancements
> > to MXNet.
> >
> > Timeframes will probably vary based on the scope. I think we should plan
> > to start 1.4.1 release within a couple of weeks, 1.5.0 should target
> > starting once we release 1.4.1, and 2.0 timeline is TBD - but such a major
> > release
> > will require more time to discuss and decide in the community.
> >
> > I was thinking to get started through:
> > (1) Draft proposals on CWiki, where the community can add content and
> > propose scope and features.
> > (2) Setup online meetings, where anyone can dial into, from anywhere,
> > where we will have a chance to discuss in voice+video.
> > (3) With (1)+(2) have a scope and timeline that the community, in large,
> > supports.
> >
> > Would be great to get the community's feedback and suggestions, and
> > please reply if you would like to be involved in the effort of supporting
> > the releases!
> >
> > MXNet is awesome, looking forward to working together to make it even
> > better!
> > Hagay
> >
> > [0] https://github.com/apache/incubator-mxnet/pull/14586
> > [1]
> > https://cwiki.apache.org/confluence/display/MXNET/%5BDRAFT+PROPOSAL%5D+Apache+MXNet+%28incubating%29+1.4.1+Release+Plan+and+Status
>


Re: [MXNET 2.0 Wishlist] [DISCUSS] Single build system

2019-04-04 Thread Marco de Abreu
+1 towards having a single build system

I'd like to add the benefit of this approach allowing us to have the same
build logic across all operating systems. It would be great if we could
make x86/Unix, x86/windows, x86/mac and ARM/Unix first class citizens from
the beginning.

-Marco

Kellen Sunderland  wrote on Thu., 4 Apr. 2019, 12:31:

> Hello MXNet devs,
>
> I'd like to start a thread discussing what our build system should look
> like in MXNet 2.0.  I'd propose that although the current make system has
> served us well in the past, we remove it along with the bump to 2.0.  The
> end goal I'd like to see is that we have a clean build system, without a
> bunch of conditional logic that makes contributing and testing MXNet a
> simpler process.  Additionally I'd propose we target a minimum cmake
> version of 3.7 for reasons described below.
>
> First I'd like to give some context on why I'd propose we don't just switch
> to cmake, but we also target a relatively new version (version 3.7 from
> Nov, 2016) of cmake.  The largest benefits in making this change would
> apply to CUDA builds where cmake itself has quite inconsistent
> functionality between versions.  One persistent annoyance I've had with
> cmake is that we've had conditional logic for the FindCUDA command which at
> one point targeted some modern cmake features, but then in subsequent
> versions of cmake the way these features work was tweaked, and now I find
> these cmake features are consistently broken to the point that I require a
> bunch of -D defines to compile properly or to use an IDE.  An additional
> CUDA related issue is that every time there's a new SM added to NVCC we
> have to make a few source changes to support it.  I could see this being
> problematic for users who may suddenly realize that due to their
> compilation settings, they may not actually be enabling the features they
> think they are with their shiny new GPUs.
>
> As an alternative if we, for example, target cmake 3.7 at a minimum, and we
> want to find cuda and then build a list of reasonable PTX/BINS we could use
> the following command[1]:
>
> 
> FindCUDA(...)
> ...
> CUDA_SELECT_NVCC_ARCH_FLAGS(ARCH_FLAGS 3.0 3.5+PTX 5.2(5.0) Maxwell)
>   LIST(APPEND CUDA_NVCC_FLAGS ${ARCH_FLAGS})
> 
>
> Simple, concise, and it would help to make the building experience more
> consistent across platforms, build environments and IDEs (looking at you
> CLion).  We'd of course need to do a little experimentation work to make
> sure that this does indeed work as intended, and can replace the currently
> complex findCuda logic we have in our build systems, but for the sake of
> the proposal let's assume these cmake commands do indeed work consistently
> as documented from cmake 3.7 onwards.
>
> To give users a chance to update their tooling I'd also suggest we begin
> warning users at least a release in advance that make based builds will be
> deprecated in MXNet 2.0 so they can begin migrating to cmake.  I'd also
> want to display deprecation messages for unused cmake flags (such as the
> profiler flag) for a release before the 2.0 release, and then remove them
> in 2.0.
>
> Of course not all users have cmake 3.7 on their systems, some of our
> employers force us to use ridiculously outdated linux distributions.  The
> good news for these users is that if we can offer Docker compilation with
> an image that has a supported version of cmake, we should be able to
> build a portable binary that works even with very old distributions of
> Linux.  Additionally installing cmake from source is also fairly
> straightforward [2] and works quite well on older distros in my experience.
>
> Looking forward to hearing what others think.  Any preferred build systems
> that you all would want to use?  Is cmake the right system to centralize
> on?  If so, is version 3.7 a reasonable minimum version to target?  Is the
> 2.0 release a good point at which we can think about simplifying build
> logic?
>
> 1: https://cmake.org/cmake/help/v3.7/module/FindCUDA.html
> 2: https://github.com/Kitware/CMake
>


Re: Requesting slack access

2019-04-04 Thread Hagay Lupesko
Hi Xiuquan,

Slack invite sent - welcome to the MXNet community!
Please slack me @Hagay Lupesko - would love to chat about how you guys are
thinking about using MXNet.

Hagay

On Thu, Apr 4, 2019 at 1:24 AM Xiuquan Lv  wrote:

> Dear MXNet community,
>
>
>
>
> Please add me to the MXNet Slack community.
>
>
>
>
> Thanks
>
> Xiuquan Lv


Re: Reminder: MXNet Berlin User Group

2019-04-04 Thread Chance Bair
Correct, this is a pattern.  As I am relatively new to MXNet, it
might be good to elaborate on what users would want out of a remote user
group.  I would love to hear suggestions!

Chance Bair



On Thu, Apr 4, 2019 at 2:48 PM Isabel Drost-Fromm  wrote:

>
>
> On 4 April 2019 13:29:28 CEST, Chance Bair wrote:
> >Again, there were no attendees.
>
> Is that a pattern, or was that just the case for the past two events?
>
> If the former, maybe we could brainstorm here what could be done to make
> the offer more attractive?
>
>
> Isabel
>
> --
> This message was sent with K-9 from a mobile device with swipe to type
> enabled. I'm sorry for any embarrassing typos that slipped through.
>


Re: Reminder: MXNet Berlin User Group

2019-04-04 Thread Isabel Drost-Fromm



On 4 April 2019 13:29:28 CEST, Chance Bair wrote:
>Again, there were no attendees.

Is that a pattern, or was that just the case for the past two events?

If the former, maybe we could brainstorm here what could be done to make the 
offer more attractive?


Isabel

-- 
This message was sent with K-9 from a mobile device with swipe to type enabled. 
I'm sorry for any embarrassing typos that slipped through.


Re: Reminder: MXNet Berlin User Group

2019-04-04 Thread Chance Bair
Again, there were no attendees.

Chance Bair



On Thu, Apr 4, 2019 at 1:07 PM Isabel Drost-Fromm  wrote:

> On Tue, Apr 02, 2019 at 01:50:40PM +0200, Chance Bair wrote:
> > This is a friendly reminder that MXNet Berlin User Group will be held
> > today at 6pm-7pm (CEST) / 9am-10am (PST). More info here:
> > https://cwiki.apache.org/confluence/display/MXNET/Apache+MXNet+%28Incubating%29+User+Groups+recurring+meetings
>
> Same question as last week: Would you please share more information on how
> the event turned out in terms of attendees, topics etc.?
>
>
> Isabel
>
> --
> Sorry for any typos: Mail was typed in vim, written in mutt, via ssh (most
> likely involving some kind of mobile connection only.)
>


Re: Reminder: MXNet Berlin User Group

2019-04-04 Thread Isabel Drost-Fromm
On Tue, Apr 02, 2019 at 01:50:40PM +0200, Chance Bair wrote:
> This is a friendly reminder that MXNet Berlin User Group will be held today
> at 6pm-7pm (CEST) / 9am-10am (PST). More info here:
> https://cwiki.apache.org/confluence/display/MXNET/Apache+MXNet+%28Incubating%29+User+Groups+recurring+meetings

Same question as last week: Would you please share more information on how the
event turned out in terms of attendees, topics etc.?


Isabel

-- 
Sorry for any typos: Mail was typed in vim, written in mutt, via ssh (most 
likely involving some kind of mobile connection only.)


Re: Podling Report Reminder - April 2019

2019-04-03 Thread Hagay Lupesko
Sounds good Sheng, thanks!

On Tue, Apr 2, 2019, 17:26 Sheng Zha  wrote:

> Thanks for the reminder. I’m working on it and will post the draft back to
> the list, and would appreciate feedback from the community by then.
>
> -sz
>
> > On Apr 2, 2019, at 5:23 PM, Tianqi Chen 
> wrote:
> >
> > It would be great if the PPMC coordinate and prepare the report
> >
> >> On Tue, Apr 2, 2019 at 4:00 PM Hagay Lupesko  wrote:
> >>
> >> Is anyone working on the podling report?
> >> I'm happy to take care of that if no one else is planning to do it.
> >>
> >> Cheers,
> >> Hagay
> >>
> >>> On Fri, Mar 29, 2019 at 4:06 PM  wrote:
> >>>
> >>> Dear podling,
> >>>
> >>> This email was sent by an automated system on behalf of the Apache
> >>> Incubator PMC. It is an initial reminder to give you plenty of time to
> >>> prepare your quarterly board report.
> >>>
> >>> The board meeting is scheduled for Wed, 17 April 2019, 10:30 am PDT.
> >>> The report for your podling will form a part of the Incubator PMC
> >>> report. The Incubator PMC requires your report to be submitted 2 weeks
> >>> before the board meeting, to allow sufficient time for review and
> >>> submission (Wed, April 03).
> >>>
> >>> Please submit your report with sufficient time to allow the Incubator
> >>> PMC, and subsequently board members to review and digest. Again, the
> >>> very latest you should submit your report is 2 weeks prior to the board
> >>> meeting.
> >>>
> >>> Candidate names should not be made public before people are actually
> >>> elected, so please do not include the names of potential committers or
> >>> PPMC members in your report.
> >>>
> >>> Thanks,
> >>>
> >>> The Apache Incubator PMC
> >>>
> >>> Submitting your Report
> >>>
> >>> --
> >>>
> >>> Your report should contain the following:
> >>>
> >>> *   Your project name
> >>> *   A brief description of your project, which assumes no knowledge of
> >>>     the project or necessarily of its field
> >>> *   A list of the three most important issues to address in the move
> >>>     towards graduation.
> >>> *   Any issues that the Incubator PMC or ASF Board might wish/need to
> >>>     be aware of
> >>> *   How has the community developed since the last report
> >>> *   How has the project developed since the last report.
> >>> *   How does the podling rate their own maturity.
> >>>
> >>> This should be appended to the Incubator Wiki page at:
> >>>
> >>> https://wiki.apache.org/incubator/April2019
> >>>
> >>> Note: This is manually populated. You may need to wait a little before
> >>> this page is created from a template.
> >>>
> >>> Mentors
> >>> ---
> >>>
> >>> Mentors should review reports for their project(s) and sign them off on
> >>> the Incubator wiki page. Signing off reports shows that you are
> >>> following the project - projects that are not signed may raise alarms
> >>> for the Incubator PMC.
> >>>
> >>> Incubator PMC
> >>>
> >>
>


Re: Discussing plans for next MXNet releases

2019-04-02 Thread Sheng Zha
Hi Hagay,

Thanks for taking the initiative. The proposed scope in this thread is in
my opinion too large to fit in a single thread, so I'd suggest that we
start separate threads for each individual release item. To elaborate on
the reasons based on each individual item:
- For 1.4.1 which is in the wiki page draft, I'd suggest refraining from
adding new features there since patch release should be about bug fixes.
- For 1.5, there are efforts such as AMP and general improvement for fp16
support in operators, quantization efforts, etc., that should be included.
I may have a bit more context on this so I'm happy to help initiate the
discussion.
- For 2.0, I think it would be more of a roadmap discussion at this stage.

I hope this makes sense. Would you mind starting a thread focusing on 1.4.1
patch release?

-sz


On Tue, Apr 2, 2019 at 5:06 PM Hagay Lupesko  wrote:

> Dear MXNet community,
>
> I wanted to initiate a discussion about the plan and scope for the next
> MXNet releases.
> I suggest we focus on three releases, and get the process going in
> parallel:
> (1) 1.4.1 - patch release on top of 1.4.0 to address some perf regressions
> and memory leaks I am aware of, such as the memory leak fixed on Scala
> [0]. I went ahead and created a draft release proposal wiki [1
> <https://cwiki.apache.org/confluence/display/MXNET/%5BDRAFT+PROPOSAL%5D+Apache+MXNet+%28incubating%29+1.4.1+Release+Plan+and+Status>].
> (2) 1.5.0 - a minor release to add new features introduced since 1.4.0
> release started (back in Nov 2018!), such as various performance
> improvements: aggregate SGD, in-place updates in optimizers, gpu support
> for image processing operators and many more features useful for MXNet’s
> users.
> (3) 2.0 - an exciting major release that will include major enhancements to
> MXNet.
>
> Timeframes will probably vary based on the scope. I think we should plan to
> start 1.4.1 release within a couple of weeks, 1.5.0 should target starting
> once we release 1.4.1, and 2.0 timeline is TBD - but such a major release
> will require more time to discuss and decide in the community.
>
> I was thinking to get started through:
> (1) Draft proposals on CWiki, where the community can add content and
> propose scope and features.
> (2) Setup online meetings, where anyone can dial into, from anywhere, where
> we will have a chance to discuss in voice+video.
> (3) With (1)+(2) have a scope and timeline that the community, in large,
> supports.
>
> Would be great to get the community's feedback and suggestions, and please
> reply if you would like to be involved in the effort of supporting the
> releases!
>
> MXNet is awesome, looking forward to working together to make it even
> better!
> Hagay
>
> [0] https://github.com/apache/incubator-mxnet/pull/14586
> [1]
>
> https://cwiki.apache.org/confluence/display/MXNET/%5BDRAFT+PROPOSAL%5D+Apache+MXNet+%28incubating%29+1.4.1+Release+Plan+and+Status
>


Re: Discussing plans for next MXNet releases

2019-04-02 Thread Pedro Larroy
Great initiative.

I would like to add the issue that tracks APIs that we would like to
break for 2.0 so we can take the chance to streamline and improve
customer facing code:

https://github.com/apache/incubator-mxnet/issues/9686

I would be happy to volunteer for the 2.0 release with assistance from a committer.

Pedro.

On Tue, Apr 2, 2019 at 5:06 PM Hagay Lupesko  wrote:
>
> Dear MXNet community,
>
> I wanted to initiate a discussion about the plan and scope for the next
> MXNet releases.
> I suggest we focus on three releases, and get the process going in parallel:
> (1) 1.4.1 - patch release on top of 1.4.0 to address some perf regressions
> and memory leaks I am aware of, such as the memory leak fixed on Scala
> [0]. I went ahead and created a draft release proposal wiki [1].
> (2) 1.5.0 - a minor release to add new features introduced since 1.4.0
> release started (back in Nov 2018!), such as various performance
> improvements: aggregate SGD, in-place updates in optimizers, gpu support
> for image processing operators and many more features useful for MXNet’s
> users.
> (3) 2.0 - an exciting major release that will include major enhancements to
> MXNet.
>
> Timeframes will probably vary based on the scope. I think we should plan to
> start 1.4.1 release within a couple of weeks, 1.5.0 should target starting
> once we release 1.4.1, and 2.0 timeline is TBD - but such a major release
> will require more time to discuss and decide in the community.
>
> I was thinking to get started through:
> (1) Draft proposals on CWiki, where the community can add content and
> propose scope and features.
> (2) Setup online meetings, where anyone can dial into, from anywhere, where
> we will have a chance to discuss in voice+video.
> (3) With (1)+(2) have a scope and timeline that the community, in large,
> supports.
>
> Would be great to get the community's feedback and suggestions, and please
> reply if you would like to be involved in the effort of supporting the
> releases!
>
> MXNet is awesome, looking forward to working together to make it even
> better!
> Hagay
>
> [0] https://github.com/apache/incubator-mxnet/pull/14586
> [1]
> https://cwiki.apache.org/confluence/display/MXNET/%5BDRAFT+PROPOSAL%5D+Apache+MXNet+%28incubating%29+1.4.1+Release+Plan+and+Status


Re: Discussing plans for next MXNet releases

2019-04-02 Thread kellen sunderland
Release breakdown makes sense to me Hagay.  Thanks for initiating a
discussion.

Some features that I'm personally looking forward to that I hope can make
it into 1.5 (schedule permitting):
*  TensorRT being integrated with the subgraph API
*  VNNI MKLDNN support
*  AMP training in MXNet

I like the idea of having a call to align on features for the 1.5 release.
For those unable to dial in we have a rapporteur who can send notes around
after the meeting.

For the 2.0 release I wonder if we could start a thread that would have a
list of big changes/features people would like to see.  I know there have
been a few changes I've made that required sub-optimal implementations to
avoid a breaking change.  This could be a good opportunity to clean up prior
work.  It'd also be a good opportunity to prune our operators to those that
are well supported, and to make sure they're named and structured in an
understandable way for users and contributors.

-Kellen

On Tue, Apr 2, 2019 at 5:06 PM Hagay Lupesko  wrote:

> Dear MXNet community,
>
> I wanted to initiate a discussion about the plan and scope for the next
> MXNet releases.
> I suggest we focus on three releases, and get the process going in
> parallel:
> (1) 1.4.1 - patch release on top of 1.4.0 to address some perf regressions
> and memory leaks I am aware of, such as the memory leak fixed on Scala
> [0]. I went ahead and created a draft release proposal wiki [1
> <https://cwiki.apache.org/confluence/display/MXNET/%5BDRAFT+PROPOSAL%5D+Apache+MXNet+%28incubating%29+1.4.1+Release+Plan+and+Status>].
> (2) 1.5.0 - a minor release to add new features introduced since 1.4.0
> release started (back in Nov 2018!), such as various performance
> improvements: aggregate SGD, in-place updates in optimizers, gpu support
> for image processing operators and many more features useful for MXNet’s
> users.
> (3) 2.0 - an exciting major release that will include major enhancements to
> MXNet.
>
> Timeframes will probably vary based on the scope. I think we should plan to
> start 1.4.1 release within a couple of weeks, 1.5.0 should target starting
> once we release 1.4.1, and 2.0 timeline is TBD - but such a major release
> will require more time to discuss and decide in the community.
>
> I was thinking to get started through:
> (1) Draft proposals on CWiki, where the community can add content and
> propose scope and features.
> (2) Setup online meetings, where anyone can dial into, from anywhere, where
> we will have a chance to discuss in voice+video.
> (3) With (1)+(2) have a scope and timeline that the community, in large,
> supports.
>
> Would be great to get the community's feedback and suggestions, and please
> reply if you would like to be involved in the effort of supporting the
> releases!
>
> MXNet is awesome, looking forward to working together to make it even
> better!
> Hagay
>
> [0] https://github.com/apache/incubator-mxnet/pull/14586
> [1]
>
> https://cwiki.apache.org/confluence/display/MXNET/%5BDRAFT+PROPOSAL%5D+Apache+MXNet+%28incubating%29+1.4.1+Release+Plan+and+Status
>


Re: Podling Report Reminder - April 2019

2019-04-02 Thread Sheng Zha
Thanks for the reminder. I’m working on it and will post the draft back to the 
list, and would appreciate feedback from the community by then.

-sz

> On Apr 2, 2019, at 5:23 PM, Tianqi Chen  wrote:
> 
> It would be great if the PPMC coordinate and prepare the report
> 
>> On Tue, Apr 2, 2019 at 4:00 PM Hagay Lupesko  wrote:
>> 
>> Is anyone working on the podling report?
>> I'm happy to take care of that if no one else is planning to do it.
>> 
>> Cheers,
>> Hagay
>> 
>>> On Fri, Mar 29, 2019 at 4:06 PM  wrote:
>>> 
>>> Dear podling,
>>> 
>>> This email was sent by an automated system on behalf of the Apache
>>> Incubator PMC. It is an initial reminder to give you plenty of time to
>>> prepare your quarterly board report.
>>> 
>>> The board meeting is scheduled for Wed, 17 April 2019, 10:30 am PDT.
>>> The report for your podling will form a part of the Incubator PMC
>>> report. The Incubator PMC requires your report to be submitted 2 weeks
>>> before the board meeting, to allow sufficient time for review and
>>> submission (Wed, April 03).
>>> 
>>> Please submit your report with sufficient time to allow the Incubator
>>> PMC, and subsequently board members to review and digest. Again, the
>>> very latest you should submit your report is 2 weeks prior to the board
>>> meeting.
>>> 
>>> Candidate names should not be made public before people are actually
>>> elected, so please do not include the names of potential committers or
>>> PPMC members in your report.
>>> 
>>> Thanks,
>>> 
>>> The Apache Incubator PMC
>>> 
>>> Submitting your Report
>>> 
>>> --
>>> 
>>> Your report should contain the following:
>>> 
>>> *   Your project name
>>> *   A brief description of your project, which assumes no knowledge of
>>>     the project or necessarily of its field
>>> *   A list of the three most important issues to address in the move
>>>     towards graduation.
>>> *   Any issues that the Incubator PMC or ASF Board might wish/need to be
>>>     aware of
>>> *   How has the community developed since the last report
>>> *   How has the project developed since the last report.
>>> *   How does the podling rate their own maturity.
>>> 
>>> This should be appended to the Incubator Wiki page at:
>>> 
>>> https://wiki.apache.org/incubator/April2019
>>> 
>>> Note: This is manually populated. You may need to wait a little before
>>> this page is created from a template.
>>> 
>>> Mentors
>>> ---
>>> 
>>> Mentors should review reports for their project(s) and sign them off on
>>> the Incubator wiki page. Signing off reports shows that you are
>>> following the project - projects that are not signed may raise alarms
>>> for the Incubator PMC.
>>> 
>>> Incubator PMC
>>> 
>> 


Re: Podling Report Reminder - April 2019

2019-04-02 Thread Tianqi Chen
It would be great if the PPMC coordinate and prepare the report

On Tue, Apr 2, 2019 at 4:00 PM Hagay Lupesko  wrote:

> Is anyone working on the podling report?
> I'm happy to take care of that if no one else is planning to do it.
>
> Cheers,
> Hagay
>
> On Fri, Mar 29, 2019 at 4:06 PM  wrote:
>
> > Dear podling,
> >
> > This email was sent by an automated system on behalf of the Apache
> > Incubator PMC. It is an initial reminder to give you plenty of time to
> > prepare your quarterly board report.
> >
> > The board meeting is scheduled for Wed, 17 April 2019, 10:30 am PDT.
> > The report for your podling will form a part of the Incubator PMC
> > report. The Incubator PMC requires your report to be submitted 2 weeks
> > before the board meeting, to allow sufficient time for review and
> > submission (Wed, April 03).
> >
> > Please submit your report with sufficient time to allow the Incubator
> > PMC, and subsequently board members to review and digest. Again, the
> > very latest you should submit your report is 2 weeks prior to the board
> > meeting.
> >
> > Candidate names should not be made public before people are actually
> > elected, so please do not include the names of potential committers or
> > PPMC members in your report.
> >
> > Thanks,
> >
> > The Apache Incubator PMC
> >
> > Submitting your Report
> >
> > --
> >
> > Your report should contain the following:
> >
> > *   Your project name
> > *   A brief description of your project, which assumes no knowledge of
> > the project or necessarily of its field
> > *   A list of the three most important issues to address in the move
> > towards graduation.
> > *   Any issues that the Incubator PMC or ASF Board might wish/need to be
> > aware of
> > *   How has the community developed since the last report
> > *   How has the project developed since the last report.
> > *   How does the podling rate their own maturity.
> >
> > This should be appended to the Incubator Wiki page at:
> >
> > https://wiki.apache.org/incubator/April2019
> >
> > Note: This is manually populated. You may need to wait a little before
> > this page is created from a template.
> >
> > Mentors
> > ---
> >
> > Mentors should review reports for their project(s) and sign them off on
> > the Incubator wiki page. Signing off reports shows that you are
> > following the project - projects that are not signed may raise alarms
> > for the Incubator PMC.
> >
> > Incubator PMC
> >
>


Re: Podling Report Reminder - April 2019

2019-04-02 Thread Hagay Lupesko
Is anyone working on the podling report?
I'm happy to take care of that if no one else is planning to do it.

Cheers,
Hagay

On Fri, Mar 29, 2019 at 4:06 PM  wrote:

> Dear podling,
>
> This email was sent by an automated system on behalf of the Apache
> Incubator PMC. It is an initial reminder to give you plenty of time to
> prepare your quarterly board report.
>
> The board meeting is scheduled for Wed, 17 April 2019, 10:30 am PDT.
> The report for your podling will form a part of the Incubator PMC
> report. The Incubator PMC requires your report to be submitted 2 weeks
> before the board meeting, to allow sufficient time for review and
> submission (Wed, April 03).
>
> Please submit your report with sufficient time to allow the Incubator
> PMC, and subsequently board members to review and digest. Again, the
> very latest you should submit your report is 2 weeks prior to the board
> meeting.
>
> Candidate names should not be made public before people are actually
> elected, so please do not include the names of potential committers or
> PPMC members in your report.
>
> Thanks,
>
> The Apache Incubator PMC
>
> Submitting your Report
>
> --
>
> Your report should contain the following:
>
> *   Your project name
> *   A brief description of your project, which assumes no knowledge of
> the project or necessarily of its field
> *   A list of the three most important issues to address in the move
> towards graduation.
> *   Any issues that the Incubator PMC or ASF Board might wish/need to be
> aware of
> *   How has the community developed since the last report
> *   How has the project developed since the last report.
> *   How does the podling rate their own maturity.
>
> This should be appended to the Incubator Wiki page at:
>
> https://wiki.apache.org/incubator/April2019
>
> Note: This is manually populated. You may need to wait a little before
> this page is created from a template.
>
> Mentors
> ---
>
> Mentors should review reports for their project(s) and sign them off on
> the Incubator wiki page. Signing off reports shows that you are
> following the project - projects that are not signed may raise alarms
> for the Incubator PMC.
>
> Incubator PMC
>


Re: [DISCUSS] Rebrand Gluon to MXNet imperative or something MXNet.

2019-04-02 Thread Lieven Govaerts
Hi,


On Thu, 28 Mar 2019 at 22:08, Mu Li  wrote:

> The name Gluon was originally used for a collaboration project between
> Amazon and Microsoft [1].
> I pinged both the Apache and Amazon legal teams later, and they confirmed
> Gluon is not considered a trademark.
>
> [1]
>
> https://aws.amazon.com/blogs/aws/introducing-gluon-a-new-library-for-machine-learning-from-aws-and-microsoft/
>
The outcome of that collaboration project, a separate API targeting
multiple backends, seems to have been effectively abandoned:

https://github.com/gluon-api/gluon-api

The code ended up in the Apache MXNet codebase, but was never integrated
into Microsoft's CNTK.

https://github.com/Microsoft/CNTK/issues/3049


If the Gluon name is important to the ASF, we should treat it as a brand
and at least check whether anyone else owns the rights to that name.
(Microsoft, maybe?)


What was the original plan with GluonCV and GluonNLP?

https://github.com/dmlc/gluon-nlp

https://github.com/dmlc/gluon-cv

While these are separate projects, they share the Gluon name with MXNet
Gluon, are hosted on MXNet.io, and, besides using the ALv2 license, their
code is also licensed to the ASF:
# Licensed to the Apache Software Foundation (ASF) under one
# or more contributor license agreements.

Are these projects on their way to being integrated into the MXNet project?
Or are they going in the other direction?

regards,

Lieven


> On Thu, Mar 28, 2019 at 2:04 PM Isabel Drost-Fromm 
> wrote:
>
> >
> >
> > On 28 March 2019 21:53:16 CET, Mu Li  wrote:
> > >
> > >The reason we call it GluonCV instead of MXNetCV is that MXNet is a
> > >trademark owned by Apache, while Gluon doesn't have this issue.
> >
> > Who's the "we" in that sentence?
> >
> > If it doesn't belong to Apache, who owns the Gluon trademark?
> >
> > Isabel
> >
> >
> > --
> > This message was sent from my Android device with K-9 Mail.
> >
>


Re: [Announcement] New Committer - Alex Zai

2019-04-02 Thread Alex Zai
Thanks everyone. Looking forward to helping MXNet continue to grow.

Best,
Alex

On Mon, Apr 1, 2019 1:38 AM, Lin Yuan apefor...@gmail.com  wrote:
Congrats, Alex! Hope your book is going well :)
Lin
On Sun, Mar 31, 2019 at 6:18 PM Zhao, Patric  wrote:
Congratulations, Alex.

Thank you for your help with the MKLDNN backend, including tests, coverage,
and CI :)

Looking forward to more cooperation together.


> -Original Message-
> From: Steffen Rochel [mailto:steffenroc...@gmail.com]
> Sent: Monday, April 1, 2019 8:56 AM
> To: dev@mxnet.incubator.apache.org
> Cc: Alex Zai 
> Subject: Re: [Announcement] New Committer - Alex Zai
> 
> Congratulations, Alex!
> 
> On Sun, Mar 31, 2019 at 4:17 PM Carin Meier 
> wrote:
> 
> > Welcome and congrats!
> >
> > On Sun, Mar 31, 2019 at 12:48 PM Anirudh Subramanian <
> > anirudh2...@gmail.com>
> > wrote:
> >
> > > Hi all,
> > >
> > > Please join me to welcome Alex Zai as a new committer of Apache
> > > (incubating) MXNet!
> > >
> > > Alex has been instrumental in bringing MKLDNN from experimental to
> > > making it default on MXNet master. This involved adding Python and C++
> > > unit tests, improving CI coverage for MKLDNN, testing MKLDNN on
> > > different platforms and working on issues related to MKLDNN.
> > >
> > > PRs:
> > >
> > >
> > https://github.com/apache/incubator-
> mxnet/pulls?utf8=%E2%9C%93=is%3A
> > pr+author%3Aazai91+
> > >
> > > Issues:
> > >
> > >
> > https://github.com/apache/incubator-
> mxnet/issues?utf8=%E2%9C%93=is%3
> > Aissue+involves%3Aazai91
> > >
> > > Reviews:
> > >
> > >
> > https://github.com/apache/incubator-
> mxnet/pulls?page=1=is%3Apr+revie
> > wed-by%3Aazai91=%E2%9C%93
> > >
> > > Dev:
> > > https://lists.apache.org/list.html?d...@mxnet.apache.org:lte=3y:azai9
> > > 1
> > >
> > > Thanks,
> > >
> > > Anirudh
> > >
> >

Re: [Announcement] New Committer - Alex Zai

2019-04-01 Thread Ravi Kiran Krovvidi
Congrats Alex Zai!

On Sun, Mar 31, 2019 at 8:44 PM Marco de Abreu 
wrote:

> Congratulations and welcome!
>
> -Marco
>
> Lin Yuan  schrieb am So., 31. März 2019, 20:38:
>
> > Congrats, Alex! Hope your book is going well :)
> >
> > Lin
> >
> > On Sun, Mar 31, 2019 at 6:18 PM Zhao, Patric 
> > wrote:
> >
> > > Congratulations, Alex.
> > >
> > > Thank you for your help with the MKLDNN backend, including tests,
> > > coverage, and CI :)
> > >
> > > Looking forward to more cooperation together.
> > >
> > >
> > > > -Original Message-
> > > > From: Steffen Rochel [mailto:steffenroc...@gmail.com]
> > > > Sent: Monday, April 1, 2019 8:56 AM
> > > > To: dev@mxnet.incubator.apache.org
> > > > Cc: Alex Zai 
> > > > Subject: Re: [Announcement] New Committer - Alex Zai
> > > >
> > > > Congratulations, Alex!
> > > >
> > > > On Sun, Mar 31, 2019 at 4:17 PM Carin Meier 
> > > > wrote:
> > > >
> > > > > Welcome and congrats!
> > > > >
> > > > > On Sun, Mar 31, 2019 at 12:48 PM Anirudh Subramanian <
> > > > > anirudh2...@gmail.com>
> > > > > wrote:
> > > > >
> > > > > > Hi all,
> > > > > >
> > > > > > Please join me to welcome Alex Zai as a new committer of Apache
> > > > > > (incubating) MXNet!
> > > > > >
> > > > > > Alex has been instrumental in bringing MKLDNN from experimental
> > > > > > to making it default on MXNet master. This involved adding Python
> > > > > > and C++ unit tests, improving CI coverage for MKLDNN, testing
> > > > > > MKLDNN on different platforms and working on issues related to
> > > > > > MKLDNN.
> > > > > >
> > > > > > PRs:
> > > > > >
> > > > > >
> > > > > https://github.com/apache/incubator-
> > > > mxnet/pulls?utf8=%E2%9C%93=is%3A
> > > > > pr+author%3Aazai91+
> > > > > >
> > > > > > Issues:
> > > > > >
> > > > > >
> > > > > https://github.com/apache/incubator-
> > > > mxnet/issues?utf8=%E2%9C%93=is%3
> > > > > Aissue+involves%3Aazai91
> > > > > >
> > > > > > Reviews:
> > > > > >
> > > > > >
> > > > > https://github.com/apache/incubator-
> > > > mxnet/pulls?page=1=is%3Apr+revie
> > > > > wed-by%3Aazai91=%E2%9C%93
> > > > > >
> > > > > > Dev:
> > > > > >
> > https://lists.apache.org/list.html?d...@mxnet.apache.org:lte=3y:azai9
> > > > > > 1
> > > > > >
> > > > > > Thanks,
> > > > > >
> > > > > > Anirudh
> > > > > >
> > > > >
> > >
> >
>


Re: Include R-package

2019-04-01 Thread Anirudh Acharya
There was a conversation on this some time back here -
https://lists.apache.org/list.html?d...@mxnet.apache.org:2018-12:Rcpp%20licensing%20in%20Apache%20MXNet


-
Anirudh


On Mon, Apr 1, 2019 at 12:19 PM Zach Kimberg 
wrote:

> As part of the current MXNet release process, the R-package is removed from
> the source release [1]. If we are advertising that MXNet has an R package
> as an Apache project, it really should be part of the official Apache
> release process. I know there were a few missing license headers within the
> package, as it is currently excluded from the license check [2]. If someone
> fixes those, are there any other reasons why it can't or shouldn't be
> released?
>
> Zach
>
>
>
> [1] - https://cwiki.apache.org/confluence/display/MXNET/Release+Process
> [2] -
>
> https://github.com/apache/incubator-mxnet/blob/master/tests/nightly/apache_rat_license_check/rat-excludes#L9
>
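
For reference, the boilerplate at issue is the standard ASF source header.
In R's comment syntax it would look roughly like the sketch below; this is
the standard Apache boilerplate, not text taken from the MXNet R-package
itself:

    # Licensed to the Apache Software Foundation (ASF) under one
    # or more contributor license agreements.  See the NOTICE file
    # distributed with this work for additional information
    # regarding copyright ownership.  The ASF licenses this file
    # to you under the Apache License, Version 2.0 (the
    # "License"); you may not use this file except in compliance
    # with the License.  You may obtain a copy of the License at
    #
    #   http://www.apache.org/licenses/LICENSE-2.0
    #
    # Unless required by applicable law or agreed to in writing,
    # software distributed under the License is distributed on an
    # "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND,
    # either express or implied.  See the License for the specific
    # language governing permissions and limitations
    # under the License.

Files carrying this header should then pass the Apache RAT license check,
after which the R-package entry could presumably be dropped from the
rat-excludes file.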


Re: [Announcement] New Committer - Alex Zai

2019-03-31 Thread Marco de Abreu
Congratulations and welcome!

-Marco

Lin Yuan  schrieb am So., 31. März 2019, 20:38:

> Congrats, Alex! Hope your book is going well :)
>
> Lin
>
> On Sun, Mar 31, 2019 at 6:18 PM Zhao, Patric 
> wrote:
>
> > Congratulations, Alex.
> >
> > Thank you for your help with the MKLDNN backend, including tests,
> > coverage, and CI :)
> >
> > Looking forward to more cooperation together.
> >
> >
> > > -Original Message-
> > > From: Steffen Rochel [mailto:steffenroc...@gmail.com]
> > > Sent: Monday, April 1, 2019 8:56 AM
> > > To: dev@mxnet.incubator.apache.org
> > > Cc: Alex Zai 
> > > Subject: Re: [Announcement] New Committer - Alex Zai
> > >
> > > Congratulations, Alex!
> > >
> > > On Sun, Mar 31, 2019 at 4:17 PM Carin Meier 
> > > wrote:
> > >
> > > > Welcome and congrats!
> > > >
> > > > On Sun, Mar 31, 2019 at 12:48 PM Anirudh Subramanian <
> > > > anirudh2...@gmail.com>
> > > > wrote:
> > > >
> > > > > Hi all,
> > > > >
> > > > > Please join me to welcome Alex Zai as a new committer of Apache
> > > > > (incubating) MXNet!
> > > > >
> > > > > Alex has been instrumental in bringing MKLDNN from experimental to
> > > > > making it default on MXNet master. This involved adding Python and
> > > > > C++ unit tests, improving CI coverage for MKLDNN, testing MKLDNN on
> > > > > different platforms and working on issues related to MKLDNN.
> > > > >
> > > > > PRs:
> > > > >
> > > > >
> > > > https://github.com/apache/incubator-
> > > mxnet/pulls?utf8=%E2%9C%93=is%3A
> > > > pr+author%3Aazai91+
> > > > >
> > > > > Issues:
> > > > >
> > > > >
> > > > https://github.com/apache/incubator-
> > > mxnet/issues?utf8=%E2%9C%93=is%3
> > > > Aissue+involves%3Aazai91
> > > > >
> > > > > Reviews:
> > > > >
> > > > >
> > > > https://github.com/apache/incubator-
> > > mxnet/pulls?page=1=is%3Apr+revie
> > > > wed-by%3Aazai91=%E2%9C%93
> > > > >
> > > > > Dev:
> > > > >
> https://lists.apache.org/list.html?d...@mxnet.apache.org:lte=3y:azai9
> > > > > 1
> > > > >
> > > > > Thanks,
> > > > >
> > > > > Anirudh
> > > > >
> > > >
> >
>


Re: [Announcement] New Committer - Alex Zai

2019-03-31 Thread Lin Yuan
Congrats, Alex! Hope your book is going well :)

Lin

On Sun, Mar 31, 2019 at 6:18 PM Zhao, Patric  wrote:

> Congratulations, Alex.
>
> Thank you for your help with the MKLDNN backend, including tests, coverage,
> and CI :)
>
> Looking forward to more cooperation together.
>
>
> > -Original Message-
> > From: Steffen Rochel [mailto:steffenroc...@gmail.com]
> > Sent: Monday, April 1, 2019 8:56 AM
> > To: dev@mxnet.incubator.apache.org
> > Cc: Alex Zai 
> > Subject: Re: [Announcement] New Committer - Alex Zai
> >
> > Congratulations, Alex!
> >
> > On Sun, Mar 31, 2019 at 4:17 PM Carin Meier 
> > wrote:
> >
> > > Welcome and congrats!
> > >
> > > On Sun, Mar 31, 2019 at 12:48 PM Anirudh Subramanian <
> > > anirudh2...@gmail.com>
> > > wrote:
> > >
> > > > Hi all,
> > > >
> > > > Please join me to welcome Alex Zai as a new committer of Apache
> > > > (incubating) MXNet!
> > > >
> > > > Alex has been instrumental in bringing MKLDNN from experimental to
> > > > making it default on MXNet master. This involved adding Python and C++
> > > > unit tests, improving CI coverage for MKLDNN, testing MKLDNN on
> > > > different platforms and working on issues related to MKLDNN.
> > > >
> > > > PRs:
> > > >
> > > >
> > > https://github.com/apache/incubator-
> > mxnet/pulls?utf8=%E2%9C%93=is%3A
> > > pr+author%3Aazai91+
> > > >
> > > > Issues:
> > > >
> > > >
> > > https://github.com/apache/incubator-
> > mxnet/issues?utf8=%E2%9C%93=is%3
> > > Aissue+involves%3Aazai91
> > > >
> > > > Reviews:
> > > >
> > > >
> > > https://github.com/apache/incubator-
> > mxnet/pulls?page=1=is%3Apr+revie
> > > wed-by%3Aazai91=%E2%9C%93
> > > >
> > > > Dev:
> > > > https://lists.apache.org/list.html?d...@mxnet.apache.org:lte=3y:azai9
> > > > 1
> > > >
> > > > Thanks,
> > > >
> > > > Anirudh
> > > >
> > >
>


RE: [Announcement] New Committer - Alex Zai

2019-03-31 Thread Zhao, Patric
Congratulations, Alex.

Thank you for your help with the MKLDNN backend, including tests, coverage,
and CI :)

Looking forward to more cooperation together.


> -Original Message-
> From: Steffen Rochel [mailto:steffenroc...@gmail.com]
> Sent: Monday, April 1, 2019 8:56 AM
> To: dev@mxnet.incubator.apache.org
> Cc: Alex Zai 
> Subject: Re: [Announcement] New Committer - Alex Zai
> 
> Congratulations, Alex!
> 
> On Sun, Mar 31, 2019 at 4:17 PM Carin Meier 
> wrote:
> 
> > Welcome and congrats!
> >
> > On Sun, Mar 31, 2019 at 12:48 PM Anirudh Subramanian <
> > anirudh2...@gmail.com>
> > wrote:
> >
> > > Hi all,
> > >
> > > Please join me to welcome Alex Zai as a new committer of Apache
> > > (incubating) MXNet!
> > >
> > > Alex has been instrumental in bringing MKLDNN from experimental to
> > > making it default on MXNet master. This involved adding Python and C++
> > > unit tests, improving CI coverage for MKLDNN, testing MKLDNN on
> > > different platforms and working on issues related to MKLDNN.
> > >
> > > PRs:
> > >
> > >
> > https://github.com/apache/incubator-
> mxnet/pulls?utf8=%E2%9C%93=is%3A
> > pr+author%3Aazai91+
> > >
> > > Issues:
> > >
> > >
> > https://github.com/apache/incubator-
> mxnet/issues?utf8=%E2%9C%93=is%3
> > Aissue+involves%3Aazai91
> > >
> > > Reviews:
> > >
> > >
> > https://github.com/apache/incubator-
> mxnet/pulls?page=1=is%3Apr+revie
> > wed-by%3Aazai91=%E2%9C%93
> > >
> > > Dev:
> > > https://lists.apache.org/list.html?d...@mxnet.apache.org:lte=3y:azai9
> > > 1
> > >
> > > Thanks,
> > >
> > > Anirudh
> > >
> >


Re: [Announcement] New Committer - Alex Zai

2019-03-31 Thread Steffen Rochel
Congratulations, Alex!

On Sun, Mar 31, 2019 at 4:17 PM Carin Meier  wrote:

> Welcome and congrats!
>
> On Sun, Mar 31, 2019 at 12:48 PM Anirudh Subramanian <
> anirudh2...@gmail.com>
> wrote:
>
> > Hi all,
> >
> > Please join me to welcome Alex Zai as a new committer of Apache
> > (incubating) MXNet!
> >
> > Alex has been instrumental in bringing MKLDNN from experimental to making
> > it default on MXNet master. This involved adding Python and C++ unit
> > tests, improving CI coverage for MKLDNN, testing MKLDNN on different
> > platforms and working on issues related to MKLDNN.
> >
> > PRs:
> >
> >
> https://github.com/apache/incubator-mxnet/pulls?utf8=%E2%9C%93=is%3Apr+author%3Aazai91+
> >
> > Issues:
> >
> >
> https://github.com/apache/incubator-mxnet/issues?utf8=%E2%9C%93=is%3Aissue+involves%3Aazai91
> >
> > Reviews:
> >
> >
> https://github.com/apache/incubator-mxnet/pulls?page=1=is%3Apr+reviewed-by%3Aazai91=%E2%9C%93
> >
> > Dev:
> > https://lists.apache.org/list.html?d...@mxnet.apache.org:lte=3y:azai91
> >
> > Thanks,
> >
> > Anirudh
> >
>


Re: [Announcement] New Committer - Alex Zai

2019-03-31 Thread Carin Meier
Welcome and congrats!

On Sun, Mar 31, 2019 at 12:48 PM Anirudh Subramanian 
wrote:

> Hi all,
>
> Please join me to welcome Alex Zai as a new committer of Apache
> (incubating) MXNet!
>
> Alex has been instrumental in bringing MKLDNN from experimental to making it
> default on MXNet master. This involved adding Python and C++ unit tests,
> improving CI coverage for MKLDNN, testing MKLDNN on different platforms and
> working on issues related to MKLDNN.
>
> PRs:
>
> https://github.com/apache/incubator-mxnet/pulls?utf8=%E2%9C%93=is%3Apr+author%3Aazai91+
>
> Issues:
>
> https://github.com/apache/incubator-mxnet/issues?utf8=%E2%9C%93=is%3Aissue+involves%3Aazai91
>
> Reviews:
>
> https://github.com/apache/incubator-mxnet/pulls?page=1=is%3Apr+reviewed-by%3Aazai91=%E2%9C%93
>
> Dev:
> https://lists.apache.org/list.html?d...@mxnet.apache.org:lte=3y:azai91
>
> Thanks,
>
> Anirudh
>


Re: Requesting slack access

2019-03-30 Thread Naveen Swamy
done. Welcome to Apache MXNet.

On Sat, Mar 30, 2019 at 2:44 PM Luyang Wang  wrote:

>
>


Re: [DISCUSS] Rebrand Gluon to MXNet imperative or something MXNet.

2019-03-29 Thread Hen
(well, the Amazon ping.  For the Apache ping, please share on @apache.org
email; *multiple-hat-syndrome*)

On Fri, Mar 29, 2019 at 11:44 AM Hen  wrote:

> Can you share that ping with me on our @amazon.com emails please Mu. I'd
> like to understand the context better.
>
> On Thu, Mar 28, 2019 at 2:08 PM Mu Li  wrote:
>
>> The name Gluon was originally used for a collaboration project between
>> Amazon and Microsoft [1].
>> I pinged both the Apache and Amazon legal teams later, and they confirmed
>> Gluon is not considered a trademark.
>>
>> [1]
>>
>> https://aws.amazon.com/blogs/aws/introducing-gluon-a-new-library-for-machine-learning-from-aws-and-microsoft/
>>
>> On Thu, Mar 28, 2019 at 2:04 PM Isabel Drost-Fromm 
>> wrote:
>>
>> >
>> >
>> > On 28 March 2019 21:53:16 CET, Mu Li  wrote:
>> > >
>> > >The reason we call it GluonCV instead of MXNetCV is that MXNet is a
>> > >trademark owned by Apache, while Gluon doesn't have this issue.
>> >
>> > Who's the "we" in that sentence?
>> >
>> > If it doesn't belong to Apache, who owns the Gluon trademark?
>> >
>> > Isabel
>> >
>> >
>> > --
>> > This message was sent from my Android device with K-9 Mail.
>> >
>>
>


Re: [DISCUSS] Rebrand Gluon to MXNet imperative or something MXNet.

2019-03-29 Thread Hen
Can you share that ping with me on our @amazon.com emails please Mu. I'd
like to understand the context better.

On Thu, Mar 28, 2019 at 2:08 PM Mu Li  wrote:

> The name Gluon was originally used for a collaboration project between
> Amazon and Microsoft [1].
> I pinged both the Apache and Amazon legal teams later, and they confirmed
> Gluon is not considered a trademark.
>
> [1]
>
> https://aws.amazon.com/blogs/aws/introducing-gluon-a-new-library-for-machine-learning-from-aws-and-microsoft/
>
> On Thu, Mar 28, 2019 at 2:04 PM Isabel Drost-Fromm 
> wrote:
>
> >
> >
> > On 28 March 2019 21:53:16 CET, Mu Li  wrote:
> > >
> > >The reason we call it GluonCV instead of MXNetCV is that MXNet is a
> > >trademark owned by Apache, while Gluon doesn't have this issue.
> >
> > Who's the "we" in that sentence?
> >
> > If it doesn't belong to Apache, who owns the Gluon trademark?
> >
> > Isabel
> >
> >
> > --
> > This message was sent from my Android device with K-9 Mail.
> >
>


Re: The Learning Robot

2019-03-29 Thread Carin Meier
done :)

On Fri, Mar 29, 2019 at 12:31 PM Anton Chernov  wrote:

> Thank you!
>
> Maybe you can retweet mine:
>
> https://twitter.com/lebegus/status/556984824885249
>
> And ApacheMXNet could do that afterwards as well?
>
> Anton
>
> On Fri, 29 Mar 2019 at 14:06, Carin Meier  wrote:
>
> > Great story!
> >
> > I would love to retweet it on Twitter. Is there any way to get it out on
> > the https://twitter.com/ApacheMXNet account?
> >
> > - Carin
> >
> > On Fri, Mar 29, 2019 at 6:37 AM Anton Chernov 
> wrote:
> >
> > > Dear MXNet Community,
> > >
> > >
> > > Read the development story of a robotics demo powered by deep learning
> > with
> > > Apache MXNet on an embedded platform.
> > >
> > >
> > > The Learning Robot
> > >
> > > Humans and machines, hand in hand.
> > >
> > > https://medium.com/apache-mxnet/the-learning-robot-1c2deab8f375
> > >
> > >
> > > Best
> > >
> > > Anton
> > >
> >
>


Re: The Learning Robot

2019-03-29 Thread Anton Chernov
Thank you!

Maybe you can retweet mine:

https://twitter.com/lebegus/status/556984824885249

And ApacheMXNet could do that afterwards as well?

Anton

On Fri, 29 Mar 2019 at 14:06, Carin Meier  wrote:

> Great story!
>
> I would love to retweet it on Twitter. Is there any way to get it out on the
> https://twitter.com/ApacheMXNet account?
>
> - Carin
>
> On Fri, Mar 29, 2019 at 6:37 AM Anton Chernov  wrote:
>
> > Dear MXNet Community,
> >
> >
> > Read the development story of a robotics demo powered by deep learning
> with
> > Apache MXNet on an embedded platform.
> >
> >
> > The Learning Robot
> >
> > Humans and machines, hand in hand.
> >
> > https://medium.com/apache-mxnet/the-learning-robot-1c2deab8f375
> >
> >
> > Best
> >
> > Anton
> >
>


Re: The Learning Robot

2019-03-29 Thread Carin Meier
Great story!

I would love to retweet it on Twitter. Is there any way to get it out on the
https://twitter.com/ApacheMXNet account?

- Carin

On Fri, Mar 29, 2019 at 6:37 AM Anton Chernov  wrote:

> Dear MXNet Community,
>
>
> Read the development story of a robotics demo powered by deep learning with
> Apache MXNet on an embedded platform.
>
>
> The Learning Robot
>
> Humans and machines, hand in hand.
>
> https://medium.com/apache-mxnet/the-learning-robot-1c2deab8f375
>
>
> Best
>
> Anton
>

