On 09/10/2015 09:22 AM, Gael Varoquaux wrote:
>
> These functions are important for reuse in an algorithmic setting: if I
> am doing an algorithm that uses k-means or lars_path inside the
> algorithm, it is much more natural to use the functions, and they have
> less overhead.
>
> I think that the target usecase for the functions is not the same as for
> objects. They target more advanced users who understand better what they
> do. For this reason, things like input-parameter validation, that are
> heavily tested by the common tests, should probably not be in the
> functions (they induce overhead which may be quite important inside an
> algorithm). In a sense, I feel that common tests are less important, and
> maybe not wanted for functions, as we will be putting exceptions all the
> time.

I feel it is quite awkward if the function and the estimator have different requirements on X. And your statement that they are "for advanced users" is not reflected in the API or the documentation. There is no reason a user would expect one to act differently from the other.
Why do you say the functions have less overhead? And why are they more natural to use?

    cluster_centers, labels, inertia = k_means(X, n_clusters=10)

is a bit shorter than

    cluster_centers = KMeans(n_clusters=10).fit(X).cluster_centers_

but the difference is really not that much.
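
To make the comparison concrete, here is a minimal sketch of the two call styles side by side. It assumes a scikit-learn version where `k_means` is importable from `sklearn.cluster` and returns a `(centers, labels, inertia)` tuple; the random data, `n_init=10`, and `random_state=0` are my own illustrative choices, not anything from the thread.

```python
import numpy as np
from sklearn.cluster import KMeans, k_means

# Toy data, purely illustrative: 200 points in 2 dimensions.
rng = np.random.RandomState(0)
X = rng.rand(200, 2)

# Function interface: a single call that returns a tuple.
centers_f, labels_f, inertia_f = k_means(
    X, n_clusters=10, n_init=10, random_state=0
)

# Estimator interface: construct, fit, then read the fitted attributes.
km = KMeans(n_clusters=10, n_init=10, random_state=0).fit(X)
centers_e = km.cluster_centers_
labels_e = km.labels_

print(centers_f.shape, centers_e.shape)
```

Both variants take the same X and the same keyword arguments, which is why I would expect them to impose the same requirements on the input.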