Re: [Scikit-learn-general] RFC (also by users) on interpreting 1d X

Arnaud Joly Mon, 04 May 2015 00:58:35 -0700

I am in favour of raising an error.

Arnaud



> On 01 May 2015, at 19:58, Gael Varoquaux <[email protected]> wrote:
> 
> I strongly advise raising an error. Very very very strongly.
> 
> Being lax about ambiguous inputs makes prototyping and interactive usage
> easier: less typing, and the system gets it right most of the time.
> However, it makes production use and debugging complex code much harder.
> Indeed, errors that are not simple user mistakes but are generated by a
> complex framework do not raise exceptions; they surface as problems
> further down the line.
> 
> We are not R. We require a bit more typing, and we don't have as many
> shortcuts and magic syntax. But we can be used in production, on big
> datasets. We can be used by people like Airbus to monitor failures of
> parts in planes [*], or by many others.
> 
> Yes, beginners want things to 'just work', but in the long run, they are
> thankful for a well-thought-out and strict specification.
> 
> Gaël
> 
> 
> [*]
> http://www.pyvideo.org/video/3519/scikit-learn-for-predictive-maintenance-at-airbus
> 
> On Fri, May 01, 2015 at 06:51:00PM +0100, Luca Puggini wrote:
>> I vote for 3.
> 
>> On Fri, May 1, 2015 at 6:27 PM, Andreas Mueller <[email protected]> wrote:
> 
>>    Hi all.
>>    A quick question on the future API.
>>    What should happen if a user passes an X with shape (N,), in other words
>>    X.ndim == 1?
> 
>>    This is unfortunately not really consistent in scikit-learn right now.
>>    Three things are possible:
>>    1) Raise an error
>>    2) N = n_features, that is X contains a single sample
>>    3) N = n_samples, that is X has a single feature
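A minimal sketch of the three options listed above, using NumPy only. The helper name `interpret_1d` and its `mode` argument are hypothetical, made up purely for illustration; they are not part of scikit-learn.

```python
import numpy as np


def interpret_1d(X, mode):
    """Hypothetical helper: resolve a 1-d X under one of the three options."""
    X = np.asarray(X)
    if X.ndim != 1:
        return X  # not 1-d, so there is nothing ambiguous to resolve
    if mode == "error":      # option 1: refuse to guess
        raise ValueError("Expected 2-d X, got 1-d array of shape %r" % (X.shape,))
    if mode == "sample":     # option 2: N = n_features, a single sample
        return X.reshape(1, -1)
    if mode == "feature":    # option 3: N = n_samples, a single feature
        return X.reshape(-1, 1)
    raise ValueError("unknown mode: %r" % (mode,))


x = np.arange(5)
print(interpret_1d(x, "sample").shape)   # (1, 5)
print(interpret_1d(x, "feature").shape)  # (5, 1)
```

Options 2 and 3 both succeed silently on the same input and return different shapes, which is the ambiguity under discussion.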
> 
>>    I would think it should be N=n_samples. Gael thinks (iirc) we should raise
>>    an error.
>>    In the code, we currently take N=n_features in predict, decision_function,
>>    predict_proba and transform, basically everywhere.
>>    This is in part due to using ``check_array`` everywhere, which used the
>>    backward-compatible (but odd) behavior of np.atleast_2d.
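For reference, a short sketch of what np.atleast_2d does to a 1-d array, and of the explicit reshapes that avoid the guess. It illustrates NumPy's behaviour only and makes no claim about how ``check_array`` is implemented internally.

```python
import numpy as np

x = np.arange(3)        # shape (3,)
X = np.atleast_2d(x)    # a new axis is prepended
print(X.shape)          # (1, 3): one row, read as one sample with 3 features

# Reshaping explicitly states which interpretation is meant.
one_sample = x.reshape(1, -1)    # shape (1, 3): N = n_features
one_feature = x.reshape(-1, 1)   # shape (3, 1): N = n_samples
```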
> 
>>    In ``fit`` it looks like all estimators assume N=n_features, apart from
>>    DictionaryLearning, MinMaxScaler, StandardScaler, which assume N=n_samples.
> 
>>    See https://github.com/scikit-learn/scikit-learn/pull/4511 for more
>>    discussion
> 
>>    Obviously any change we make would mean a deprecation cycle: warning in
>>    0.17 and 0.18, whenever someone passes a 1-dim X, that the behavior will
>>    change soon, and then actually changing it in 0.19 (1.0?).
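One way such a transition period could look, as a rough sketch only. The function name `check_X` and the warning text are assumptions made for illustration and are not taken from scikit-learn; the old behaviour is kept by calling np.atleast_2d while a DeprecationWarning is emitted.

```python
import warnings

import numpy as np


def check_X(X):
    """Hypothetical validator for the deprecation period (not scikit-learn code)."""
    X = np.asarray(X)
    if X.ndim == 1:
        warnings.warn(
            "Passing 1-d X is deprecated and its handling will change in a "
            "future release; reshape explicitly with X.reshape(-1, 1) or "
            "X.reshape(1, -1).",
            DeprecationWarning,
            stacklevel=2,
        )
        X = np.atleast_2d(X)  # keep the old behaviour while warning
    return X


# Record warnings so the effect is visible regardless of the default filters.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    X = check_X(np.arange(4))

print(X.shape)      # (1, 4)
print(len(caught))  # 1
```

Once the cycle ends, the warning branch would be replaced by whichever of the three options is chosen, for example a plain ValueError.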
> 
>>    Andy
> 
>>    
>> ------------------------------------------------------------------------------
>>    One dashboard for servers and applications across Physical-Virtual-Cloud
>>    Widest out-of-the-box monitoring support with 50+ applications
>>    Performance metrics, stats and reports that give you Actionable Insights
>>    Deep dive visibility with transaction tracing using APM Insight.
>>    http://ad.doubleclick.net/ddm/clk/290420510;117567292;y
>>    _______________________________________________
>>    Scikit-learn-general mailing list
>>    [email protected]
>>    https://lists.sourceforge.net/lists/listinfo/scikit-learn-general
> 
> 
> -- 
>    Gael Varoquaux
>    Researcher, INRIA Parietal
>    NeuroSpin/CEA Saclay , Bat 145, 91191 Gif-sur-Yvette France
>    Phone:  ++ 33-1-69-08-79-68
>    http://gael-varoquaux.info            http://twitter.com/GaelVaroquaux
> 

