Hi Lee,
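The scoring parameter, if not the name of an existing scorer, needs to be a
function with the signature:
    fn(estimator, X, y_true) -> score (higher is better)
Your snippet instead calls v_measure_score(...) while constructing the
GridSearchCV, before anything has been fitted, which is why kmeans has no
labels_ attribute yet.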
So I think you want to define:
def score_clusters(estimator, X, y):
    # estimator is the fitted pipeline; predict() assigns X to clusters
    return v_measure_score(y[:, 0], estimator.predict(X))
Then construct the GridSearchCV, and pass the labels to fit() so that they
reach the scorer as y:
estimator = GridSearchCV(pipe, dict(kpca__gamma=gammas),
                         scoring=score_clusters)
estimator.fit(D_scaled, labels)
It seems like there should be more predefined scorers available for
clustering...
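
For what it's worth, here is a minimal end-to-end sketch of the idea. It is
untested against your data: D_scaled and labels below are synthetic stand-ins
built with make_blobs, the gamma grid is shortened, cv=3 and the KMeans
settings are arbitrary choices of mine, and the import path
sklearn.model_selection assumes a recent scikit-learn release.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs
from sklearn.decomposition import KernelPCA
from sklearn.metrics import v_measure_score
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import Pipeline

# Synthetic stand-ins (assumption: the real D_scaled and labels are not
# shown in this thread). labels is made 2-D so that y[:, 0] applies.
D_scaled, blob_ids = make_blobs(n_samples=150, centers=3, n_features=5,
                                random_state=0)
labels = blob_ids.reshape(-1, 1)


def score_clusters(estimator, X, y):
    # GridSearchCV calls this with the pipeline fitted on the training
    # fold, plus the held-out X and y for that fold.
    return v_measure_score(y[:, 0], estimator.predict(X))


pipe = Pipeline(steps=[
    ("kpca", KernelPCA(kernel="rbf")),
    ("kmeans", KMeans(n_clusters=3, n_init=10, random_state=0)),
])

# Shortened grid for illustration only.
gammas = np.logspace(-4, 0, num=5)

search = GridSearchCV(pipe, dict(kpca__gamma=gammas),
                      scoring=score_clusters, cv=3)
# The labels are ignored by KernelPCA and KMeans during fitting; they are
# passed only so that the scorer receives them as y.
search.fit(D_scaled, labels)

print(search.best_params_, search.best_score_)
```

Note that the scorer uses estimator.predict(X) rather than labels_, because
labels_ describes the training fold while the score is computed on the
held-out fold.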
Cheers,
- Joel
On 14 May 2014 09:10, Lee Zamparo <zamp...@gmail.com> wrote:
> Hi,
>
> I'm trying to use GridSearchCV and Pipeline to tune the gamma
> parameter of kernel PCA. I'd like to use kernel PCA to transform the
> data, followed by kmeans to cluster the data, followed by v-measure to
> measure the goodness of fit of the clustering.
>
> Here's the relevant snippet of my script
> -----
> # Set up the kPCA -> kmeans -> v-measure pipeline
> kpca = KernelPCA(kernel="rbf")
> kmeans = KMeans(n_clusters=3)
> pipe = Pipeline(steps=[('kpca', kpca), ('kmeans', kmeans)])
>
> # Range of parameters to consider for gamma in the RBF kernel for kPCA
> gammas = np.logspace(-10,2,num=100)
>
> # Parameters of pipelines are set using ‘__’ separated parameter names:
> estimator = GridSearchCV(pipe, dict(kpca__gamma=gammas),
> scoring=v_measure_score(labels[:,0],kmeans.labels_))
> estimator.fit(D_scaled)
>
> -----
>
> Yet I get an AttributeError claiming that the kmeans object has no
> labels_ attribute.
>
> File "/home/lee/projects/SdA_reduce/utils/kernel_pca_pipeline.py",
> line 86, in <module>
> estimator = GridSearchCV(pipe, dict(kpca__gamma=gammas),
> scoring=v_measure_score(labels[:,0],kmeans.labels_))
>
> AttributeError: 'KMeans' object has no attribute 'labels_'
>
> Does anyone have any tips on how I should restructure my snippet to
> get my desired outcome?
>
> Thanks,
>
> Lee.
>
>
> _______________________________________________
> Scikit-learn-general mailing list
> Scikit-learn-general@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/scikit-learn-general
>