Re: Gluon fit API- Design proposal

2019-03-26 Thread Mu Li
It's great to see the proposal has a list of models that this API should
cover. But note that the D2L book has a very simplified train function (I
wrote it). It's oversimplified compared to what we are using in real life.
Kaggle competition solutions and popular github projects are closer to what
people are using. Alfredo also provided three very sensible use cases.

Besides that, here are my comments:

1. Do we know the limitations of the current API, e.g. which models this
API does not support? So far, it's hard to say that it covers 80% of use
cases.
2. Extensibility. So far a single gluon.Estimator class does all the
work. Are we considering allowing this class to be extended to support
future use cases?
3. It's confusing that some variable names end with 's' while others don't.
For example,
https://github.com/apache/incubator-mxnet/blob/fit-api/python/mxnet/gluon/estimator/estimator.py#L54
has loss, metrics, trainers and context. All of them support list inputs.
4. Related to 3, how do lists of inputs map to each other? E.g. if I give a
list of loss functions, what are their inputs, and what are the loss
weights? Similarly, what do multiple trainers mean?
5. How to initialize a model with different initializers for different
layers (pretty common use case)
6. How to restart for a previous
7. What if a model has multiple components, e.g. encoder-decoder, or
multi-modality training?
8. It's strange to specify a batch_size in fit() for new users. (I know the
reason, but it needs to be explained to users.)
9. event_handler is quite powerful, but how can users access internal
states through the handler? BTW, we usually call it a callback in Python.
10. Programming style. The code should be easy to read. For example, use
the same naming conventions as mxnet/pytorch/keras. A function should be
short; otherwise break it into several pieces. Currently fit() has >100
lines of code.
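
To make points 4 and 9 concrete, here is a minimal pure-Python sketch. It is
not the code on the fit-api branch: ToyEstimator, Callback, LossHistory,
loss_weights and the state dict are hypothetical names invented for
illustration, and the "batches" are just (prediction, label) pairs so that
the example runs without MXNet. It assumes one possible answer to point 4
(every loss sees the same inputs and the results are combined as a weighted
sum) and one possible answer to point 9 (each callback is handed the
estimator itself, so internal state is reachable).

```python
# Hypothetical sketch; none of these names come from the fit-api branch.

class Callback:
    """Base class: every hook receives the estimator, so state is reachable."""

    def on_epoch_end(self, estimator):
        pass


class LossHistory(Callback):
    """Records the combined loss stored on the estimator after each epoch."""

    def __init__(self):
        self.history = []

    def on_epoch_end(self, estimator):
        self.history.append(estimator.state["loss"])


class ToyEstimator:
    def __init__(self, losses, loss_weights=None, callbacks=None):
        self.losses = list(losses)
        # Default (an assumption): weight every loss equally.
        self.loss_weights = list(loss_weights or [1.0] * len(self.losses))
        if len(self.loss_weights) != len(self.losses):
            raise ValueError("need exactly one weight per loss")
        self.callbacks = list(callbacks or [])
        self.state = {"epoch": 0, "loss": None}

    def _combined_loss(self, pred, label):
        # Every loss sees the same (pred, label) pair; weighted sum of results.
        return sum(w * loss(pred, label)
                   for w, loss in zip(self.loss_weights, self.losses))

    def _run_epoch(self, batches):
        # Mean combined loss over all batches of one epoch.
        total = 0.0
        count = 0
        for pred, label in batches:
            total += self._combined_loss(pred, label)
            count += 1
        return total / count

    def fit(self, batches, epochs):
        # fit() stays short by delegating to the helpers above (point 10).
        for epoch in range(epochs):
            self.state["epoch"] = epoch
            self.state["loss"] = self._run_epoch(batches)
            for callback in self.callbacks:
                callback.on_epoch_end(self)


def l1(pred, label):
    return abs(pred - label)


def l2(pred, label):
    return (pred - label) ** 2
```

A caller would construct it as ToyEstimator([l1, l2], loss_weights=[1.0, 0.5],
callbacks=[LossHistory()]) and read the recorded values from the callback
afterwards. Whether the real API should pair losses and weights positionally
like this is exactly the open question in point 4.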

Although there are multiple opportunities to improve, it's great to see
that you already have an implementation so we can dive into details.

On Tue, Mar 5, 2019 at 10:14 AM Naveen Swamy  wrote:

> FYI, I have created a branch on the repo to facilitate multiple
> collaborators for this feature.
> https://github.com/apache/incubator-mxnet/tree/fit-api, they'll create PRs
> to this branch and once the api is feature complete, i will rebase and
> merge to master to preserve commit history
>
> On Sun, Feb 10, 2019 at 2:43 PM Hagay Lupesko  wrote:
>
> > Wanted to chime in as well.
> > I have reviewed the design shared in the mail offline with Ankit, Lai and
> > Naveen (we work in the same team in Amazon).
> >
> > I think it does a good job at simplifying many low-complexity training
> > use cases, which can make MXNet/Gluon even more friendly to so-called
> > "deep learning beginners" - so +1 on the proposal!
> >
> > Hagay
> >
> > On Fri, Feb 8, 2019 at 10:30 AM Naveen Swamy  wrote:
> >
> > > Hi Alfredo,
> > > Thanks for your comments, I really like all your suggestions. Here are
> > > my answers; let me know if they make sense or if you have comments.
> > >
> > > 1) The fit API is targeting novice users, covering about 80% of the use
> > > cases listed in the document. For advanced users
> > > and complex models, we (Naveen, Ankit and Lai) felt it's best to use the
> > > existing mechanisms due to their imperative nature and the greater
> > > control they give, so we did not duplicate the save/load functionality
> > > in the Hybrid block.
> > > We’ll consider it and extend the functionality to Estimator.
> > > I have had trouble using the pickle package, which is commonly used for
> > > serialization and deserialization; if you have any other suggestions
> > > from your experience, please let us know.
> > >
> > > 2) +1, we’ll add this to our backlog and add it in our next iteration.
> > >
> > > 3) Can you expand a little more on this: how does it help in a
> > > production environment (which this API was not targeted at)?
> > > I’ll check the TF Estimator to understand further.
> > >
> > > Thanks, Naveen
> > >
> > >
> > > On Thu, Feb 7, 2019 at 2:32 PM Alfredo Luque
> > >  wrote:
> > >
> > > > This is great and something we should all be able to benefit from.
> > > >
> > > > There are just three pieces I’d like to advocate for that I feel are
> > > > shortcomings of some competing APIs on other frameworks (e.g. TF
> > > > Estimators) and I would love to see in this proposal:
> > > >
> > > > 1) Make serialization/deserialization of these classifiers/regressors
> > > > easy, or at least ensure the internal members of the wrapper are easy
> > > > to save/load. We’ve hacked around this by only allowing hybrid blocks,
> > > > which have easy save/load functionality, but having a simple
> > > > “save_model”/“load_model” function as a 1st class citizen of these
> > > > proposed APIs will lead to a vastly improved user experience down the
> > > > road.
> > > >
> > > > 2) Allowing the fit/predict/predict_proba functions to take in both
> > > > data loaders and

Re: Call for Ideas and Approaches to Community Building

2019-03-26 Thread Anton Chernov
Here is a demo with some impressions:

https://youtu.be/UwJxLztoI1o

The MXNet blog post will follow soon.

We wanted to show it at GTC as well, but couldn't allocate the needed time.

You can see the code in Thomas's repository:

https://github.com/ThomasDelteil/RobotTracker_MXNet

But it's far from being readily reusable and lacks documentation.

I could see though that if we get enough time, we would wrap most things
into docker containers, write proper instructions and give the community
the opportunity to contribute and to show it on their own.

Best
Anton


On Wed, Mar 20, 2019 at 17:08, Aaron Markham wrote:

> Anton, can you share the design and specs and code for the robot arm demo?
> I wish that was being shown at GTC now. It would be great to let people
> borrow it for West coast events. Maybe I can get one built here in Palo
> Alto.
>
> On Tue, Mar 19, 2019, 05:54 Anton Chernov  wrote:
>
> > I don't know whether that is enough, but here are a few efforts we make
> > to promote MXNet:
> >
> > * The robotic arms demo at Embedded World
> > We promoted MXNet as the go-to framework for embedded devices with our
> > robotic arms demo. We've got a lot of attention from different people
> > including professors from multiple universities. A blog post about the
> > demo will be posted in the next few days on the MXNet Medium blog [1].
> >
> > Here again some impressions from twitter:
> > https://twitter.com/lebegus/status/1100839414228500485
> >
> > * MLPerf results
> > We intend to publish more benchmark results to MLPerf [2], showing proof
> > of the performance advantages of MXNet.
> >
> > * Recurring user group meetings
> > We offer recurring VC meetings [3], free for everyone. We dedicate our
> > time to anyone that would like to know more about MXNet or to ask any
> > other related question.
> >
> > * Collaborative meetups
> > We organize meetups with attendees from various companies [4], sharing
> > their interesting use cases and best practices with ML and MXNet.
> >
> > Tracking works and papers at popular scientific conferences is a valid
> > metric, but it's focused on research. More and more people who don't
> > write papers use ML and MXNet in production without knowing all the
> > scientific details. How to measure how many are out there is an open
> > question.
> >
> > Best
> > Anton
> >
> > [1] https://medium.com/apache-mxnet
> > [2] https://mlperf.org/
> > [3] https://cwiki.apache.org/confluence/x/7BY0BQ
> > [4] https://www.meetup.com/Deep-Learning-with-Apache-MXNet-Berlin
> >
> >
> > On Tue, Mar 19, 2019 at 07:23, Isabel Drost-Fromm wrote:
> >
> > >
> > >
> > > On March 19, 2019 at 02:49:23 CET, "Zhao, Patric" <
> > > patric.z...@intel.com> wrote:
> > > >I suggest encouraging and funding students/researchers to present
> > > >their work at the popular conference.
> > > >I know talking is easy but maybe the decision maker can allocate more
> > > >resources for marketing.
> > >
> > > Just for clarity, who exactly do you mean by "the decision maker"?
> > > Decision maker for what?
> > >
> > > On another note, beyond that one conference, which other channels do
> > > people here follow? How did you first hear about mxnet?
> > >
> > >
> > > Isabel
> > >
> > >
> > > --
> > > This message was sent from my Android device with K-9 Mail.
> > >
> >
>


Re: CI unstable

2019-03-26 Thread Anton Chernov
The fix for the CI system has been merged and the system should be stable
again. You can now rebase all stale PRs.

I have ported the fixes to the latest release branches as well:

Fixes for CI downloads (v1.4.x)
https://github.com/apache/incubator-mxnet/pull/14526

Fixes for CI downloads (v1.3.x)
https://github.com/apache/incubator-mxnet/pull/14525

I would appreciate a review and merge on those.

Best
Anton

On Fri, Mar 22, 2019 at 20:57, Anton Chernov wrote:

> Yes, you can see the changes in this PR:
>
> https://github.com/apache/incubator-mxnet/pull/14504
>
> Unfortunately, there are still a few issues left to fix.
>
> Best
> Anton
>
> On Fri, Mar 22, 2019 at 18:10, Mu Li wrote:
>
>> I saw CI is downloading from data.dmlc.ml. Changing it to data.mxnet.io
>> should fix this issue. For example:
>>
>> http://data.dmlc.ml/models/imagenet/inception-bn/Inception-BN-0126.params
>> ->
>> http://data.mxnet.io/models/imagenet/inception-bn/Inception-BN-0126.params
>>
>> On Thu, Mar 21, 2019 at 11:57 AM Anton Chernov 
>> wrote:
>>
>> > Dear MXNet Community,
>> >
>> > For a few days now we have been experiencing problems with CI PR
>> > verification builds. For some reason unix-cpu builds get aborted.
>> > Potentially there is a problem with gitlab.com, from which dependencies
>> > are downloaded for static MXNet builds.
>> >
>> > We are working hard on finding and fixing the issue. Please excuse the
>> > inconvenience.
>> >
>> > Best
>> > Anton
>> >
>>
>
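
The fixes discussed in this thread (swapping the download host from
data.dmlc.ml to data.mxnet.io, retrying downloads, and failing loudly
instead of silently) can be sketched in Python as follows. This is only an
illustration of the idea, not the contents of PR #14504, which changed the
CI scripts themselves; rewrite_host, download, and the retries, delay and
opener parameters are hypothetical names chosen for this sketch.

```python
import time
import urllib.request
from urllib.parse import urlsplit, urlunsplit

# Hosts taken from the thread above.
OLD_HOST = "data.dmlc.ml"
NEW_HOST = "data.mxnet.io"


def rewrite_host(url, old=OLD_HOST, new=NEW_HOST):
    """Return url with its host swapped from old to new; other URLs pass through."""
    parts = urlsplit(url)
    if parts.netloc != old:
        return url
    return urlunsplit(parts._replace(netloc=new))


def download(url, dest, retries=3, delay=1.0,
             opener=urllib.request.urlretrieve):
    """Fetch url into dest, retrying on OSError.

    The last failure is re-raised so that a broken download aborts the
    build with an explicit error rather than passing silently.
    """
    url = rewrite_host(url)
    for attempt in range(1, retries + 1):
        try:
            opener(url, dest)
            return url
        except OSError:  # urllib.error.URLError is a subclass of OSError
            if attempt == retries:
                raise
            time.sleep(delay)
```

The opener parameter exists only so the retry logic can be exercised without
network access, by passing a stand-in function that raises OSError a couple
of times before succeeding.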


Weekly Berlin User Group

2019-03-26 Thread Chance Bair
Hello Dev,

This is a friendly reminder that the weekly Berlin User Group will be held
today at 6pm-7pm (CEST) / 9am-10am (PST). More info here:
https://cwiki.apache.org/confluence/display/MXNET/Apache+MXNet+%28Incubating%29+User+Groups+recurring+meetings

You have been invited to an online meeting, powered by Amazon Chime.

1. Click to join the meeting:

https://chime.aws/4677180533

Meeting ID: 4677 18 0533

2. You can use your computer’s microphone and speakers, however, a headset
is recommended. Or, call in using your phone:

United States Toll-Free: +1 855-552-4463
Meeting PIN: 4677 18 0533

One-click Mobile Dial-in (United States (1)): +1 206-462-5569,,4677180533#

United States (1): +1 206-462-5569
Germany Toll-Free: +49 800 7238486
Germany: +49 89 220 61384
International: https://chime.aws/dialinnumbers/


Thanks!
Chance Bair


Re: R help

2019-03-26 Thread Per da Silva
Hey guys,

Thanks heaps for the help. Indeed, rebasing on top of Anton's changes has
fixed the R issue.

Thanks again!

Per

On Mon, Mar 25, 2019 at 6:50 PM Qing Lan  wrote:

> Change to
> http://data.mxnet.io/models/imagenet/inception-bn/Inception-BN-0126.params
> 
> would work. Please keep track of this PR:
>
> https://github.com/apache/incubator-mxnet/pull/14504
>
> Fixes for CI downloads by lebeg · Pull Request #14504 · apache/incubator-mxnet
> Description: Fixed for CI verification builds. Changes: removed the silent
> curl option for explicit errors; added curl download retries; replaced model
> download links from dmlc to mxnet.
>
>
>
> Thanks,
> Qing
>
> 
> From: Anirudh Acharya 
> Sent: Monday, March 25, 2019 13:47
> To: dev@mxnet.incubator.apache.org
> Subject: Re: R help
>
> Yes, that is the error; we need to dig deeper into why that URL is not
> working.
>
>
> Thanks
> Anirudh
>
>
> On Mon, Mar 25, 2019 at 10:40 AM kellen sunderland <
> kellen.sunderl...@gmail.com> wrote:
>
> > Is this the error?
> > "test_model.R:129: error: Fine-tune
> >
> > cannot open URL
> > 'http://data.dmlc.ml/models/imagenet/inception-bn/Inception-BN-0126.params'
> > 1: GetInception() at R-package/tests/testthat/test_model.R:129
> > 2: download.file("http://data.dmlc.ml/models/imagenet/inception-bn/Inception-BN-0126.params",
> >    destfile = "model/Inception-BN-0126.params")"
> >
> > Looks like the
> > http://data.dmlc.ml/models/imagenet/inception-bn/Inception-BN-0126.params
> > is failing for me as well.
> >
> >
> > On Mon, Mar 25, 2019 at 10:37 AM Anirudh Acharya 
> > wrote:
> >
> > > Hi Per da Silva,
> > >
> > > Let me know if I can help, we can chat offline.
> > >
> > > At first glance it would seem
> > >
> > >    - R:MKLDNN CPU is passing whereas R:CPU is failing
> > >    - R:GPU might have failed due to this "cannot open URL
> > >      'http://data.dmlc.ml/models/imagenet/inception-bn/Inception-BN-0126.params'"
> > >
> > >
> > > Thanks
> > > Anirudh
> > >
> > >
> > > On Mon, Mar 25, 2019 at 7:34 AM Per da Silva 
> > wrote:
> > >
> > > > Dear community,
> > > >
> > > > I'm working on a PR <https://github.com/apache/incubator-mxnet/pull/14513>
> > > > to update CI GPU jobs to be based on CUDA v10. However, for some
> > > > reason, amongst other things, the R tests are failing
> > > > <http://jenkins.mxnet-ci.amazon-ml.com/blue/organizations/jenkins/mxnet-validation%2Funix-cpu/detail/PR-14513/4/pipeline>.
> > > > I would really appreciate some help from the R experts to get it
> > > > sorted =D
> > > >
> > > > Thanks in advance,
> > > >
> > > > Per
> > > >
> > >
> >
>