On Thu, Sep 10, 2015 at 09:52:44AM -0400, Andy wrote:
> I feel it is quite awkward if the function and the estimator have
> different requirements on X.

That's a point of view. But they are different things, so I am not sure
that this point of view is universal.

> And your statement "they are for advanced users" is not manifested in
> the API or documentation.

OK, but that's a bug of the documentation.

> There is no reason a user would expect one to act different from the other.

Users who don't code algorithms probably don't have any reason to be
using them.

> Why do you say the functions have less overhead?

They don't have to do things like parameter validation, and all the
book-keeping that goes with maintaining the consistent state of the
object.

> And why are they more natural to use?

People writing algorithms are not used to thinking in terms of objects.

> cluster_centers = kmeans(X, n_clusters=10)
> is a bit shorter than
> cluster_centers = KMeans(n_clusters=10).fit_predict(X)
> but the difference is really not that much.

Functions implement algorithms, with an input and an output. Objects
implement a predictor, constrained by what we define as a predictor. It
is not obvious, for a given algorithm, what the corresponding prediction
API is. The input might not always be a data matrix, and the output is
not always naturally given by one of our methods.

In this respect, the k-means problem is a good example. People writing
algorithms using k-means do not think in terms of 'fit_predict'.

There is of course value in having objects: if some of the operations, or
the inner state of the algorithm, are reused, the objects are great. But
if we just want to write, for instance, a parallel loop, functions can be
better (no internal state is a good thing when dealing with concurrency).

Gaël
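
To make the function/object distinction above concrete, here is a minimal
sketch. It is not scikit-learn code: `kmeans_1d` and `KMeans1D` are
hypothetical names invented for this illustration, the algorithm is a toy
1-D k-means using only the standard library, and the thread pool is just
one assumed way to write the parallel loop.

```python
from concurrent.futures import ThreadPoolExecutor
import random


def kmeans_1d(xs, n_clusters, n_iter=20, seed=0):
    """Hypothetical plain function: data in, (centers, labels) out."""
    rng = random.Random(seed)
    centers = rng.sample(xs, n_clusters)

    def assign():
        # Label each point with the index of its nearest center.
        return [min(range(n_clusters), key=lambda k: abs(x - centers[k]))
                for x in xs]

    for _ in range(n_iter):
        labels = assign()
        # Move each center to the mean of its points (unchanged if empty).
        for k in range(n_clusters):
            members = [x for x, lab in zip(xs, labels) if lab == k]
            if members:
                centers[k] = sum(members) / len(members)
    return centers, assign()


class KMeans1D:
    """Hypothetical estimator-style wrapper: validation plus fitted state."""

    def __init__(self, n_clusters=2, seed=0):
        self.n_clusters = n_clusters
        self.seed = seed

    def fit(self, xs):
        # The extra work the object does: check parameters, store results.
        if not isinstance(self.n_clusters, int) or self.n_clusters < 1:
            raise ValueError("n_clusters must be a positive integer")
        self.cluster_centers_, self.labels_ = kmeans_1d(
            list(xs), self.n_clusters, seed=self.seed)
        return self

    def fit_predict(self, xs):
        return self.fit(xs).labels_


data = [0.0, 0.1, 0.2, 5.0, 5.1, 5.2]

# Functional form in a parallel loop: each call is independent and no
# object state is shared between workers.
with ThreadPoolExecutor() as pool:
    results = list(pool.map(lambda s: kmeans_1d(data, 2, seed=s), range(4)))

# Object form: the fitted state stays on the instance for later reuse.
est = KMeans1D(n_clusters=2).fit(data)
```

The function hands back everything it computed and keeps nothing, which is
what makes it easy to drop into the loop; the object pays for validation
and book-keeping but keeps `cluster_centers_` around afterwards.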
_______________________________________________
Scikit-learn-general mailing list
Scikit-learn-general@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/scikit-learn-general