Hello all,
Long time lurker, first time emailer.
I have two small contributions I would like to propose to the email list.
I was working on a project this weekend that was using both categorical and
numerical columns to predict a final output. I needed to save my
transformations to make future p
Hi Dale,
The two issues you mention are indeed current bottlenecks of sklearn's
API, and we are working on solving them:
1) ColumnTransformer to be able to apply different transformers to
different columns: https://github.com/scikit-learn/scikit-learn/pull/9012/
2) As you men
Yes, but what criterion is used to decide the optimal output? The
documentation says it is the best output in terms of inertia. What does
that mean?
Thanks.
On Wed, Feb 14, 2018 at 7:46 PM, Joel Nothman wrote:
> you can repeatedly use n_init=1?
Inertia is the sum of the squared distances from each sample point to its
assigned cluster centroid. The smaller the inertia, the closer the cluster
members are to their centroid (this is also the quantity KMeans minimizes
when choosing centroids). In this context, the elbow method may be helpful
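
To make the definition concrete, here is a minimal sketch in plain Python
(no sklearn needed) that computes inertia by hand for two candidate
clusterings of the same toy data and keeps the one with the lower value,
which is the comparison the n_init documentation describes. The data, the
run names, and the inertia() helper are all made up for illustration; in
sklearn itself the corresponding value is available as the inertia_
attribute of a fitted KMeans.

```python
def inertia(points, centroids, labels):
    """Sum of squared Euclidean distances from each point to its assigned centroid."""
    total = 0.0
    for point, label in zip(points, labels):
        centroid = centroids[label]
        total += sum((p - c) ** 2 for p, c in zip(point, centroid))
    return total


# Toy data (illustrative only): the four corners of a 4 x 1 rectangle.
points = [(0.0, 0.0), (0.0, 1.0), (4.0, 0.0), (4.0, 1.0)]

# Two hypothetical converged results, as if from two different initializations.
# Both are stable under the k-means update, but they differ in quality.
runs = {
    "left/right": {"centroids": [(0.0, 0.5), (4.0, 0.5)], "labels": [0, 0, 1, 1]},
    "top/bottom": {"centroids": [(2.0, 0.0), (2.0, 1.0)], "labels": [0, 1, 0, 1]},
}

# Score every run and keep the one with the smallest inertia.
scores = {
    name: inertia(points, run["centroids"], run["labels"])
    for name, run in runs.items()
}
best = min(scores, key=scores.get)

print(scores)
print("best run:", best)
```

The left/right split has the lower inertia, so it is the one that would be
kept. Calling KMeans repeatedly with n_init=1, as suggested earlier in the
thread, gives you each run's inertia_ so you can make the same comparison
yourself.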