Hello, I am trying to tune the bandwidth for my KernelDensity. I need to find out what optimization goal to use.
I started with

    from sklearn.grid_search import GridSearchCV
    grid = GridSearchCV(KernelDensity(),
                        {'bandwidth': np.linspace(0.1, 1.0, 30)},
                        cv=20) # 20-fold cross-validation
    grid.fit(x[:, None])
    print grid.best_params_

From https://jakevdp.github.io/blog/2013/12/01/kernel-density-estimation/#Bandwidth-Cross-Validation-in-Scikit-Learn

I have also used RandomizedSearchCV to optimize the parameters.

The problem I have is that neither refines the answer, so if I don't sample at a high enough density I don't get a good answer. What I would like to do is use the same goal but put it into a different global optimizer.

I have looked through the code for GridSearchCV and RandomizedSearchCV, and I have not yet been able to figure out what the actual optimization goal is.

Originally I thought the system was using something like

    kde_bw = KernelDensity(kernel='gaussian', bandwidth=bw)
    score = max(cross_val_score(kde_bw, data, cv=3))

and then trying to minimize that score, but that does not seem likely given the results.

If someone could help me with the goal to optimize, I should be able to solve the rest of the problem on my own.

Thanks
Bill
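For reference on the question above: when no `scoring` argument is given, GridSearchCV and RandomizedSearchCV call the estimator's own `score` method on each held-out fold, and for KernelDensity that method returns the total log-likelihood of the held-out samples. The search then keeps the candidate with the highest mean score across folds, so the goal is maximized, not minimized, and it is a mean rather than a max. Below is a minimal sketch of handing that same objective to a continuous optimizer. The synthetic data, the 5-fold split, the (0.1, 1.0) bounds, the helper name `neg_mean_cv_log_likelihood`, and the choice of `scipy.optimize.minimize_scalar` are all assumptions made for illustration, using the current `sklearn.model_selection` module rather than the older `sklearn.grid_search`.

```python
import numpy as np
from scipy.optimize import minimize_scalar
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KernelDensity

# Synthetic 1-D sample standing in for the real data (an assumption).
rng = np.random.default_rng(0)
x = rng.normal(loc=0.0, scale=1.0, size=200)
X = x[:, None]  # KernelDensity expects shape (n_samples, n_features)


def neg_mean_cv_log_likelihood(bandwidth, X, cv=5):
    """Hypothetical helper: negative mean held-out log-likelihood over folds."""
    kde = KernelDensity(kernel="gaussian", bandwidth=bandwidth)
    # With no scoring argument, cross_val_score calls kde.score(X_test),
    # i.e. the total log-likelihood of the held-out fold.
    scores = cross_val_score(kde, X, cv=cv)
    # The searches maximize the mean score; negate it for a minimizer.
    return -scores.mean()


# Bounded scalar minimization over the same range as the grid above.
result = minimize_scalar(
    neg_mean_cv_log_likelihood,
    bounds=(0.1, 1.0),
    method="bounded",
    args=(X,),
)
best_bandwidth = result.x
print(best_bandwidth)
```

The bounded method is a local search, so on a bumpy objective it may settle on a local optimum. Since the function takes a single scalar and returns a single scalar, it can be passed to a global optimizer in the same way.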
_______________________________________________
scikit-learn mailing list
scikit-learn@python.org
https://mail.python.org/mailman/listinfo/scikit-learn