Re: Gluon fit API- Design proposal
It's great to see that the proposal lists the models this API should cover. Note, though, that the D2L book has a very simplified train function (I wrote it). It is oversimplified compared with what we use in real life; Kaggle competition solutions and popular GitHub projects are closer to what people actually use. Alfredo also provided three very sensible use cases. Beyond those, here are my comments:

1. Do we know the limitations of the current API, e.g. which models it does not support? So far, it's hard to say that it covers 80% of use cases.
2. Extensibility. So far a single gluon.Estimator class does all the work. Are we considering allowing users to extend this class to support future use cases?
3. It's confusing that some variables end with 's' while others don't. For example, https://github.com/apache/incubator-mxnet/blob/fit-api/python/mxnet/gluon/estimator/estimator.py#L54 has loss, metrics, trainers and context. All of them support list inputs.
4. Related to 3, how do the lists of inputs map to each other? E.g. if I give a list of loss functions, what are their inputs, and what are the loss weights? Similarly, what do multiple trainers mean?
5. How to initialize a model with different initializers for different layers (a pretty common use case)?
6. How to restart for a previous
7. What if a model has multiple components, e.g. encoder-decoder, or multi-modality training?
8. It's strange for new users to specify a batch_size in fit(). (I know the reason, but it needs to be explained to users.)
9. event_handler is quite powerful, but how can users access internal states through the handler? BTW, we usually call it a callback in Python.
10. Programming style. The code should be easy to read. For example, use the same naming conventions as mxnet/pytorch/keras. A function should be short; otherwise, break it into several pieces. Currently fit() has more than 100 lines of code.
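To make comment 9 concrete, here is a minimal sketch of one way a callback could reach training state: the estimator passes itself to every handler, which then reads whatever attributes it needs. This is plain Python and not the code in the fit-api branch; the names ToyEstimator, LossLogger and on_batch_end are hypothetical and chosen only for illustration.

```python
class LossLogger:
    """Hypothetical callback that records the loss seen after every batch."""

    def __init__(self):
        self.history = []

    def on_batch_end(self, estimator):
        # The handler reads state directly from the estimator it is handed.
        self.history.append(
            (estimator.epoch, estimator.batch_idx, estimator.last_loss)
        )


class ToyEstimator:
    """Stand-in for an estimator; it 'trains' by halving a fake loss."""

    def __init__(self, callbacks=None):
        self.callbacks = list(callbacks or [])
        self.epoch = 0
        self.batch_idx = 0
        self.last_loss = 1.0

    def fit(self, num_batches, epochs=1):
        for epoch in range(epochs):
            self.epoch = epoch
            for batch_idx in range(num_batches):
                self.batch_idx = batch_idx
                # Placeholder for a real forward/backward/update step.
                self.last_loss *= 0.5
                for callback in self.callbacks:
                    callback.on_batch_end(self)


logger = LossLogger()
ToyEstimator(callbacks=[logger]).fit(num_batches=2, epochs=2)
print(logger.history)
```

Passing the whole estimator keeps the handler signature short, at the cost of exposing every attribute; a narrower, read-only state object would be the stricter alternative.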
Despite that there are multiple opportunities to improve, it's great to see that you already have an implementation so we can dive into details.

On Tue, Mar 5, 2019 at 10:14 AM Naveen Swamy wrote:

> FYI, I have created a branch on the repo to facilitate multiple
> collaborators for this feature.
> https://github.com/apache/incubator-mxnet/tree/fit-api, they'll create PRs
> to this branch and once the api is feature complete, i will rebase and
> merge to master to preserve commit history
Re: Gluon fit API- Design proposal
FYI, I have created a branch on the repo to facilitate multiple collaborators for this feature:
https://github.com/apache/incubator-mxnet/tree/fit-api

They'll create PRs to this branch, and once the API is feature complete, I will rebase and merge to master to preserve the commit history.

On Sun, Feb 10, 2019 at 2:43 PM Hagay Lupesko wrote:

> Wanted to chime in as well.
> I have reviewed the design shared in the mail offline with Ankit, Lai and
> Naveen (we work in the same team in Amazon).
>
> I think it does a good job at simplifying many low-complexity training use
> cases, which can make MXNet/Gluon even more friendly to so-called "deep
> learning beginners" - so +1 on the proposal!
>
> Hagay
Re: Gluon fit API- Design proposal
STOP

On Sun, Feb 10, 2019 at 10:43 PM Hagay Lupesko wrote:

> Wanted to chime in as well.
> I have reviewed the design shared in the mail offline with Ankit, Lai and
> Naveen (we work in the same team in Amazon).
>
> I think it does a good job at simplifying many low-complexity training use
> cases, which can make MXNet/Gluon even more friendly to so-called "deep
> learning beginners" - so +1 on the proposal!
>
> Hagay

--
Best regards,
Tommy Pujol
Re: Gluon fit API- Design proposal
Wanted to chime in as well. I have reviewed the design shared in the mail offline with Ankit, Lai and Naveen (we work on the same team at Amazon).

I think it does a good job of simplifying many low-complexity training use cases, which can make MXNet/Gluon even friendlier to so-called "deep learning beginners", so +1 on the proposal!

Hagay

On Fri, Feb 8, 2019 at 10:30 AM Naveen Swamy wrote:

> Hi Alfredo,
> Thanks for your comments, I really like all your suggestions. Here are my
> answers let me know if it makes sense or have comments.
>
> 1) The fit API is targeting novice users covering about 80% of the use
> cases listed in the document. For advanced users,
> and complex models, we (Naveen, Ankit and Lai) felt its best use the
> existing mechanisms due to the imperative nature and the more control it
> can give, So we did not duplicate the save/load functionality in the Hybrid
> block.
> We’ll consider and extend the functionality to Estimator.
> I have had trouble using pickle package which is commonly used for
> serialization and deserialization, if you have any other suggestions from
> your experience please let us know.
>
> 2) +1, we’ll add this to our backlog and add it in our next iteration.
>
> 3) Can you expand a little more on this, how it helps in a production
> environment (which this API was not target for) ?.
> I’ll check the TF Estimator to understand further.
>
> Thanks, Naveen
Re: Gluon fit API- Design proposal
Hi Alfredo,

Thanks for your comments; I really like all your suggestions. Here are my answers. Let me know if they make sense or if you have comments.

1) The fit API targets novice users and covers about 80% of the use cases listed in the document. For advanced users and complex models, we (Naveen, Ankit and Lai) felt it's best to use the existing mechanisms because of their imperative nature and the greater control they give, so we did not duplicate the save/load functionality of the Hybrid block. We'll consider this and extend the functionality to Estimator. I have had trouble using the pickle package, which is commonly used for serialization and deserialization; if you have any other suggestions from your experience, please let us know.

2) +1, we'll add this to our backlog and address it in our next iteration.

3) Can you expand a little more on how this helps in a production environment (which this API was not targeted at)? I'll check the TF Estimator to understand further.

Thanks, Naveen

On Thu, Feb 7, 2019 at 2:32 PM Alfredo Luque wrote:

> This is great and something we should all be able to benefit from.
>
> There are just three pieces I’d like to advocate for that I feel are
> shortcomings of some competing APIs on other frameworks (eg; TF Estimators)
> and I would love to see in this proposal:
>
> 1) Make serialization/deserialization of these classifiers/regressors easy
> or at least ensure the internal members of the wrapper are easy to
> save/load. We’ve hacked around this by only allowing hybrid blocks which
> have easy save/load functionality, but having a simple
> “save_model”/“load_model” function as a 1st class citizen of these proposed
> APIs will lead to a vastly improved user experience down the road.
>
> 2) Allowing the fit/predict/predict_proba functions to take in both data
> loaders and simple numpy arrays and pandas dataframes is a simple change
> but a huge usability improvement. Power users and library authors will
> appreciate being able to use custom data loaders but a large portion of end
> users want to just pass an ndarray or data frame and get some results
> quickly.
>
> 3) Allow lazy construction of the model. This is something I feel TF
> Estimators do well: by allowing the user to pass a function that constructs
> the net (i.e a model_fn that returns the net) rather than the net itself it
> allows for more control at runtime and usage of these APIs in a production
> environment.
>
> Would love your thoughts on these three changes/additions.
>
> —Alfredo Luque
> Software Engineer
> Machine Learning Infrastructure
> Airbnb
> San Francisco, CA
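On the pickle trouble mentioned in point 1 of the reply above, one possible direction is to keep the wrapper's own state as plain values and write it out as JSON, leaving the weights to whatever save/load mechanism the network already has. The sketch below only illustrates that idea under stated assumptions: ToyEstimator, save_model, load_model and params_file are hypothetical names, and the weights file itself is not written here.

```python
import json
import os
import tempfile


class ToyEstimator:
    """Hypothetical wrapper whose configuration is plain Python values."""

    def __init__(self, learning_rate, epochs, params_file="net.params"):
        self.learning_rate = learning_rate
        self.epochs = epochs
        # Name of the file where the network's own save routine would put
        # the weights; this sketch records the name but never creates it.
        self.params_file = params_file

    def save_model(self, path):
        state = {
            "learning_rate": self.learning_rate,
            "epochs": self.epochs,
            "params_file": self.params_file,
        }
        with open(path, "w") as f:
            json.dump(state, f)

    @classmethod
    def load_model(cls, path):
        with open(path) as f:
            state = json.load(f)
        return cls(**state)


with tempfile.TemporaryDirectory() as tmp_dir:
    config_path = os.path.join(tmp_dir, "estimator.json")
    ToyEstimator(learning_rate=0.01, epochs=5).save_model(config_path)
    restored = ToyEstimator.load_model(config_path)

print(restored.learning_rate, restored.epochs, restored.params_file)
```

JSON only handles simple values, so objects such as loss functions or metrics would need to be stored by name and rebuilt on load; that trade-off is the price of avoiding pickle.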
Re: Gluon fit API- Design proposal
This is great and something we should all be able to benefit from.

There are just three pieces I’d like to advocate for that I feel are shortcomings of some competing APIs on other frameworks (e.g. TF Estimators) and that I would love to see in this proposal:

1) Make serialization/deserialization of these classifiers/regressors easy, or at least ensure the internal members of the wrapper are easy to save/load. We’ve hacked around this by only allowing hybrid blocks, which have easy save/load functionality, but having a simple “save_model”/“load_model” function as a first-class citizen of these proposed APIs will lead to a vastly improved user experience down the road.

2) Allowing the fit/predict/predict_proba functions to take in both data loaders and simple numpy arrays and pandas dataframes is a simple change but a huge usability improvement. Power users and library authors will appreciate being able to use custom data loaders, but a large portion of end users just want to pass an ndarray or data frame and get some results quickly.

3) Allow lazy construction of the model. This is something I feel TF Estimators do well: by allowing the user to pass a function that constructs the net (i.e. a model_fn that returns the net) rather than the net itself, it allows for more control at runtime and usage of these APIs in a production environment.

Would love your thoughts on these three changes/additions.

—Alfredo Luque
Software Engineer
Machine Learning Infrastructure
Airbnb
San Francisco, CA

On February 7, 2019 at 1:51:17 PM, Ankit Khedia (khedia.an...@gmail.com) wrote:

Hello dev@,

Training a model in Gluon requires users to write the training loop, this is useful because of its imperative nature, however repeating the same code across multiple models can become tedious and repetitive with boilerplate code. The training loop can also be overwhelming to some users new to deep learning. Users have asked in [1] for a simple Fit API, similar to APIs available in SKLearn and Keras as a way to simplify model training and reduce boilerplate code and complexity.

So, I along with other contributor Naveen and Lai came up with a fit API proposal in [2] that covers 80% of the use-cases for beginners, the fit API does not replace the gluon training loops. The API proposal is inspired by the Keras fit API. I have discussed and got feedback from a few MXNet contributors (Sheng, Mu, Aston, Zhi) close by and I am writing to ask for the community’s feedback on the API proposal.

[1] https://discuss.mxnet.io/t/wrapping-gluon-into-scikit-learn-like-api/2112
[2] https://cwiki.apache.org/confluence/display/MXNET/Gluon+Fit+API+-+Tech+Design

Thanks,
Ankit
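The lazy construction described in point 3 of the message above can be sketched in a few lines: the wrapper stores the builder function and its arguments, and only calls it the first time the net is needed. This is a plain-Python illustration, not the proposed Gluon API; LazyEstimator is a hypothetical name, and the dictionary returned by model_fn merely stands in for a real network.

```python
class LazyEstimator:
    """Hypothetical wrapper that builds its net on first access."""

    def __init__(self, model_fn, **model_kwargs):
        self._model_fn = model_fn
        self._model_kwargs = model_kwargs
        self._net = None

    @property
    def net(self):
        # Build once, on demand, then reuse the same object.
        if self._net is None:
            self._net = self._model_fn(**self._model_kwargs)
        return self._net


build_calls = []  # records every time the builder actually runs


def model_fn(hidden_units):
    build_calls.append(hidden_units)
    return {"hidden_units": hidden_units}  # stand-in for a real network


estimator = LazyEstimator(model_fn, hidden_units=64)
calls_before_access = len(build_calls)  # nothing has been built yet

first = estimator.net
second = estimator.net  # served from the cached object, no rebuild
print(calls_before_access, len(build_calls), first is second)
```

Because only the function and its arguments are held until first use, the wrapper itself stays cheap to create and to ship to another process, which is one reason this pattern is attractive in production settings.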
Gluon fit API- Design proposal
Hello dev@,

Training a model in Gluon requires users to write the training loop. This is useful because of its imperative nature; however, repeating the same code across multiple models becomes tedious and produces a lot of boilerplate. The training loop can also be overwhelming to some users new to deep learning. Users have asked in [1] for a simple fit API, similar to the APIs available in SKLearn and Keras, as a way to simplify model training and reduce boilerplate code and complexity.

So I, along with fellow contributors Naveen and Lai, came up with a fit API proposal in [2] that covers 80% of the use cases for beginners. The fit API does not replace the Gluon training loops. The API proposal is inspired by the Keras fit API. I have discussed it with, and received feedback from, a few MXNet contributors close by (Sheng, Mu, Aston, Zhi), and I am writing to ask for the community’s feedback on the API proposal.

[1] https://discuss.mxnet.io/t/wrapping-gluon-into-scikit-learn-like-api/2112
[2] https://cwiki.apache.org/confluence/display/MXNET/Gluon+Fit+API+-+Tech+Design

Thanks,
Ankit
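As a rough picture of the boilerplate in question, the sketch below factors an epoch/batch loop into a single fit function so that each model only has to supply its gradient. It is a toy in plain Python (a one-parameter model trained with per-sample gradient descent), not Gluon code and not the API in the design document; the names fit and squared_error_grad are made up for this example.

```python
def fit(w, data, loss_grad, lr=0.1, epochs=50):
    """Generic loop: the part users would otherwise rewrite for each model."""
    for _ in range(epochs):
        for x, y in data:
            w -= lr * loss_grad(w, x, y)
    return w


def squared_error_grad(w, x, y):
    # Derivative with respect to w of (w * x - y) ** 2.
    return 2 * (w * x - y) * x


# Toy data generated from y = 2 * x, so the fitted weight should approach 2.
data = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0)]
w = fit(0.0, data, squared_error_grad, lr=0.05, epochs=100)
print(w)
```

A real fit API has more to factor out (metrics, validation, device placement, checkpoints), but the shape is the same: the loop lives in one place and the model-specific pieces are passed in.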