I have recently been using grid search to evaluate a custom dimensionality
reduction (DR) method, followed by a supervised or unsupervised estimator
later in the pipeline, to find out how useful the method is.

gr = grid_search.GridSearchCV(pipeline, param_grid, cv=None)

The scoring functions I used are:
1. make_scorer(adjusted_rand_score)
2. make_scorer(homogeneity_score)

The two settings that I have used successfully are:
pipeline1 = [DR method, k-NN classifier], a supervised estimator
pipeline2 = [DR method, k-means clustering], an unsupervised estimator

But I am getting an error for the following:
pipeline3 = [DR method, DBSCAN clustering]
pipeline4 = [DR method, agglomerative clustering]

The reason is that DBSCAN and agglomerative clustering do not have a
"predict" function in their API. Why is this so?

I am just guessing that maybe this is because it is not possible for
these two algorithms to assign cluster labels to unseen data. Correct me if
I am wrong.
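
To make the difference concrete, here is a minimal sketch that checks which
of the three clustering estimators expose "predict" and which expose
"fit_predict". The n_clusters values are arbitrary and only there so the
estimators can be constructed.

```python
from sklearn.cluster import KMeans, DBSCAN, AgglomerativeClustering

# Arbitrary settings; nothing is fitted, we only inspect the API surface.
estimators = {
    "KMeans": KMeans(n_clusters=3),
    "DBSCAN": DBSCAN(),
    "AgglomerativeClustering": AgglomerativeClustering(n_clusters=3),
}

# True if the estimator can label data it was not fitted on.
has_predict = {name: hasattr(est, "predict") for name, est in estimators.items()}

# True if the estimator can fit and label the same data in one call.
has_fit_predict = {
    name: hasattr(est, "fit_predict") for name, est in estimators.items()
}

for name in estimators:
    print(name, "predict:", has_predict[name], "fit_predict:", has_fit_predict[name])
```

Only KMeans offers "predict"; all three offer "fit_predict".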

Even if this is the case, shouldn't grid search decide automatically, based
on the "cv" parameter, to use either
pred = est.fit(X1).predict(X2) if cv is not None
or
pred = est.fit_predict(X) if cv is None (as in my case above)?
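
For illustration, this is roughly what I would expect the cv=None case to do,
written as a manual loop. It is only a sketch under several assumptions: PCA
stands in for my custom DR method, make_blobs provides toy data with known
labels, the eps and min_samples values are arbitrary, and ParameterGrid is
imported from sklearn.model_selection as in recent scikit-learn releases
rather than from the grid_search module used above.

```python
from sklearn.base import clone
from sklearn.cluster import DBSCAN
from sklearn.datasets import make_blobs
from sklearn.decomposition import PCA
from sklearn.metrics import adjusted_rand_score
from sklearn.model_selection import ParameterGrid
from sklearn.pipeline import Pipeline

# Toy data with ground-truth labels (assumption: stands in for the real data).
X, y = make_blobs(n_samples=150, centers=3, n_features=5,
                  cluster_std=0.5, random_state=0)

# PCA is a placeholder for the custom DR method.
pipeline = Pipeline([("dr", PCA(n_components=2)), ("cluster", DBSCAN())])
param_grid = {"cluster__eps": [0.3, 0.5, 1.0], "cluster__min_samples": [3, 5]}

results = []
for params in ParameterGrid(param_grid):
    est = clone(pipeline).set_params(**params)
    # No train/test split: fit and label the same data in one call.
    labels = est.fit_predict(X)
    results.append((adjusted_rand_score(y, labels), params))

# Pick the parameter setting with the highest adjusted Rand score.
best_score, best_params = max(results, key=lambda r: r[0])
print(best_score, best_params)
```

Because Pipeline forwards fit_predict to its final step, the same loop works
for agglomerative clustering by swapping the last step and its parameters.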

Thanks
Jitesh

Jitesh Khandelwal