Hi folks,

I have a two-class classification problem where the positive labels reside in clusters. A traditional cross-validation approach is not aware of this and splits the data points of a single cluster between the training and test sets, giving rise to optimistically inflated classification performance. I have therefore written a custom cross-validation routine that keeps the data points of each cluster entirely in either the training set or the test set, never splitting a cluster across the two. Finally, I retrain a Random Forest classifier using the whole positive set.
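A sketch of the kind of cluster-respecting split I mean, using scikit-learn's GroupKFold with the cluster ids as group labels (the data, cluster assignment, and parameter grid below are only illustrative); GridSearchCV's cv parameter accepts any iterable of (train, test) index arrays, so the same splits can drive the tuning:

```python
# Keep each cluster wholly in train or wholly in test, then reuse those
# splits for parameter tuning. X, y, clusters and the grid are toy
# placeholders; substitute your own data and cluster ids.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV, GroupKFold

X, y = make_classification(n_samples=60, random_state=0)
clusters = np.repeat(np.arange(6), 10)  # cluster id per sample

# GroupKFold never places members of one group in both train and test
splits = list(GroupKFold(n_splits=3).split(X, y, groups=clusters))
for train_idx, test_idx in splits:
    assert not set(clusters[train_idx]) & set(clusters[test_idx])

# cv= accepts a precomputed list of (train, test) index arrays
param_grid = {"n_estimators": [50, 100], "max_depth": [None, 5]}
search = GridSearchCV(RandomForestClassifier(random_state=0),
                      param_grid, cv=splits)
search.fit(X, y)
print(search.best_params_)
```

The tuned parameters in `search.best_params_` can then be used to fit the final classifier on the full data.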
My question is: can I tune the parameters of a Random Forest classifier using these cluster-aware splits and then train the final classifier with the tuned parameters? I understand that GridSearchCV or randomised parameter optimisation (RandomizedSearchCV) allows parameter tuning, but it follows a traditional CV procedure and splits the clusters I mentioned earlier.

Thanks in advance.
Mamun
_______________________________________________
Scikit-learn-general mailing list
Scikit-learn-general@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/scikit-learn-general