On Thu, Sep 10, 2015 at 09:52:44AM -0400, Andy wrote:
> I feel it is quite awkward if the function and the estimator have 
> different requirements on X.

That's a point of view. But they are different things, so I am not sure
that this point of view is universal.

> And your statement "they are for advanced users" is not manifested in 
> the API or documentation.

OK, but that's a bug in the documentation.

> There is no reason a user would expect one to act different from the other.

Users who don't code algorithms probably have no reason to be using
them.

> Why do you say the functions have less overhead?

They don't have to do things like parameter validation, and all the
book-keeping that goes with maintaining the consistent state of the
object.

> And why are they more natural to use?

People writing algorithms are not used to thinking in terms of objects.

> cluster_centers = kmeans(X, n_clusters=10)

> is a bit shorter than

> cluster_centers = KMeans(n_clusters=10).fit_predict(X)

> but the difference is really not that much.

Functions implement algorithms, with an input and an output. Objects
implement a predictor, constrained by what we define as a predictor. It's
not obvious, for a given algorithm, what the corresponding prediction API
is. The input might not always be a data matrix, and the output is not
always naturally returned by one of our methods. In this respect, the
k-means problem is a good example. People writing algorithms using
k-means do not think in terms of 'fit_predict'.
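
To make the distinction concrete, here is a minimal sketch in plain
Python, assuming 1-D data and a toy Lloyd-style k-means. The names
`lloyd_kmeans` and `ToyKMeans` are hypothetical and are not
scikit-learn's implementation; they only illustrate a function with an
input and an output next to an object that wraps it in a predictor API.

```python
import random


def _assign(points, centers):
    # Index of the nearest center for each point (squared distance).
    return [
        min(range(len(centers)), key=lambda k: (p - centers[k]) ** 2)
        for p in points
    ]


def lloyd_kmeans(points, n_clusters, n_iter=20, seed=0):
    """Hypothetical function API: points in, (centers, labels) out."""
    rng = random.Random(seed)
    centers = rng.sample(points, n_clusters)  # initial centers from the data
    for _ in range(n_iter):
        labels = _assign(points, centers)
        for k in range(n_clusters):
            members = [p for p, lab in zip(points, labels) if lab == k]
            if members:  # keep the old center if a cluster is empty
                centers[k] = sum(members) / len(members)
    # Final assignment so the labels match the returned centers.
    return centers, _assign(points, centers)


class ToyKMeans:
    """Hypothetical estimator API: validates parameters, stores state."""

    def __init__(self, n_clusters=2, n_iter=20, seed=0):
        self.n_clusters = n_clusters
        self.n_iter = n_iter
        self.seed = seed

    def fit(self, points):
        points = list(points)
        if not 1 <= self.n_clusters <= len(points):
            raise ValueError("n_clusters must be between 1 and len(points)")
        self.cluster_centers_, self.labels_ = lloyd_kmeans(
            points, self.n_clusters, self.n_iter, self.seed
        )
        return self

    def fit_predict(self, points):
        # The predictor API exposes the labels, not the centers.
        return self.fit(points).labels_
```

With the function, the centers come back directly from the call; with
the object, they have to be read from the fitted state after `fit`.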

There is of course value in having objects: if some of the operations, or
the inner state of the algorithm, are reused, objects are great. But if
we just want to write, for instance, a parallel loop, functions can be
better (no internal state is a good thing when dealing with concurrency).
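
As an illustration of the parallel-loop case, here is a small sketch
using only the standard library. The helpers `assign_labels` and
`label_batches` are hypothetical names, and threads are used only to
keep the example short; for CPU-bound work a process pool would
probably be the better choice.

```python
from concurrent.futures import ThreadPoolExecutor


def assign_labels(points, centers):
    """Stateless: the result depends only on the arguments."""
    return [
        min(range(len(centers)), key=lambda k: (p - centers[k]) ** 2)
        for p in points
    ]


def label_batches(batches, centers, max_workers=4):
    """Run assign_labels over independent batches in a parallel loop."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # map returns results in the same order as `batches`.
        return list(
            pool.map(lambda batch: assign_labels(batch, centers), batches)
        )
```

Because `assign_labels` only reads its arguments and writes nothing
shared, the workers cannot step on each other and no locking is needed.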

Gaël



_______________________________________________
Scikit-learn-general mailing list
Scikit-learn-general@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/scikit-learn-general
