Re: [Scikit-learn-general] RFC (also by users) on interpreting 1d X

federico vaggi Mon, 04 May 2015 04:33:23 -0700

I think Gael makes a very strong argument, but I think the error should be
as explicit and informative as possible (for new users).
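
As a rough illustration of what an explicit and informative error could look like, here is a minimal sketch. The helper name `check_2d` and the wording of the message are assumptions made for this example; they are not the actual scikit-learn implementation. The sketch only handles the 1d case discussed in this thread.

```python
import numpy as np


def check_2d(X):
    """Hypothetical validator: reject 1d input with an explicit message."""
    X = np.asarray(X)
    if X.ndim == 1:
        # Tell the user what was expected, what was received, and how to
        # state either interpretation explicitly.
        raise ValueError(
            "Expected a 2d array of shape (n_samples, n_features), got a 1d "
            "array of shape {0}. Use X.reshape(-1, 1) if your data has a "
            "single feature, or X.reshape(1, -1) if it contains a single "
            "sample.".format(X.shape)
        )
    return X
```

The message names both possible readings, so the user has to choose one instead of the library guessing.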


On Fri, May 1, 2015 at 7:58 PM, Gael Varoquaux <gael.varoqu...@normalesup.org> wrote:

> I strongly advise raising an error. Very very very strongly.
>
> Being lax about ambiguous inputs makes prototyping and interactive usage
> easier: less typing, and the system gets it right most of the time.
> However, it makes production use and debugging complex code much harder.
> Indeed, mistakes that may come not from a simple user error but from a
> complex framework do not raise exceptions; instead they cause problems
> further down the line.
>
> We are not R. We require a bit more typing, and we don't have as many
> shortcuts and magic syntax. But we can be used in production, on big
> datasets. We can be used by people like Airbus to monitor failures of
> parts in planes [*], or by many others.
>
> Yes, beginners want things to 'just work', but in the long run they are
> thankful for a well-thought-out and strict specification.
>
> Gaël
>
>
> [*]
>
> http://www.pyvideo.org/video/3519/scikit-learn-for-predictive-maintenance-at-airbus
>
> On Fri, May 01, 2015 at 06:51:00PM +0100, Luca Puggini wrote:
> > I vote for 3.
>
> > On Fri, May 1, 2015 at 6:27 PM, Andreas Mueller <t3k...@gmail.com> wrote:
>
> >     Hi all.
> >     A quick question on the future API.
> >     What should happen if a user passes an X with shape (N,), in other words
> >     X.ndim == 1?
>
> >     This is unfortunately not really consistent in scikit-learn right now.
> >     Three things are possible:
> >     1) Raise an error
> >     2) N = n_features, that is X contains a single sample
> >     3) N = n_samples, that is X has a single feature
>
> >     I would think it should be N=n_samples. Gael thinks (iirc) we should raise
> >     an error.
> >     In the code, we currently take N=n_features in predict, decision_function,
> >     predict_proba and transform, basically everywhere.
> >     This is in part due to using ``check_array`` everywhere, which used the
> >     backward-compatible (but odd) behavior of np.atleast_2d.
>
> >     In ``fit`` it looks like all estimators assume N=n_features, apart from
> >     DictionaryLearning, MinMaxScaler, StandardScaler, which assume N=n_samples.
>
> >     See https://github.com/scikit-learn/scikit-learn/pull/4511 for more
> >     discussion
>
> >     Obviously any change we make would mean a deprecation cycle: in 0.17
> >     and 0.18 we would warn anyone who passes a 1-dim X that the behavior
> >     will change soon, and then actually change it in 0.19 (1.0?).
>
> >     Andy
>
> >
>  
> ------------------------------------------------------------------------------
> >     One dashboard for servers and applications across
> Physical-Virtual-Cloud
> >     Widest out-of-the-box monitoring support with 50+ applications
> >     Performance metrics, stats and reports that give you Actionable
> Insights
> >     Deep dive visibility with transaction tracing using APM Insight.
> >     http://ad.doubleclick.net/ddm/clk/290420510;117567292;y
> >     _______________________________________________
> >     Scikit-learn-general mailing list
> >     Scikit-learn-general@lists.sourceforge.net
> >     https://lists.sourceforge.net/lists/listinfo/scikit-learn-general
>
>
> --
>     Gael Varoquaux
>     Researcher, INRIA Parietal
>     NeuroSpin/CEA Saclay , Bat 145, 91191 Gif-sur-Yvette France
>     Phone:  ++ 33-1-69-08-79-68
>     http://gael-varoquaux.info            http://twitter.com/GaelVaroquaux
>
>
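
For reference, options 2 and 3 from Andy's message can be sketched with plain NumPy. This is only an illustration of the two shapes under discussion, not the code scikit-learn runs internally; the variable names are made up for the example.

```python
import numpy as np

x = np.arange(5)  # shape (5,), i.e. x.ndim == 1

# Option 2: N = n_features, x is a single sample.
# This is the shape np.atleast_2d produces for 1d input.
as_one_sample = np.atleast_2d(x)
print(as_one_sample.shape)  # (1, 5)

# Option 3: N = n_samples, x holds a single feature.
as_one_feature = x.reshape(-1, 1)
print(as_one_feature.shape)  # (5, 1)
```

Both results are valid 2d arrays, which is why silently picking one of them hides the ambiguity instead of resolving it.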
