In addition to the missing y=None in the fit signature, KernelDensity doesn't have a transform or predict method, and I don't think Pipeline supports score or score_samples. Maybe someone can comment on this, but I don't think KDE is typically used in a pipeline.
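To make the y=None point concrete, here is a minimal sketch that does not use scikit-learn at all. ToyPipeline, FitWithoutY and FitWithY are hypothetical stand-ins I made up for illustration, not the real classes; the only assumption is that a pipeline hands y on to the final step's fit, as the traceback in this thread shows.

```python
class ToyPipeline:
    """Hypothetical stand-in: forwards X and y to the final step's fit."""

    def __init__(self, final_step):
        self.final_step = final_step

    def fit(self, X, y=None):
        # y is always passed along positionally, even when it is None.
        self.final_step.fit(X, y)
        return self


class FitWithoutY:
    """Hypothetical estimator whose fit signature is (self, X)."""

    def fit(self, X):
        self.n_samples_ = len(X)
        return self


class FitWithY:
    """Same estimator with the conventional (self, X, y=None) signature."""

    def fit(self, X, y=None):
        # y is accepted and ignored, which is all the pipeline needs.
        self.n_samples_ = len(X)
        return self


X = [[0.0], [1.0], [2.0]]

try:
    ToyPipeline(FitWithoutY()).fit(X)
    raised = False
except TypeError:
    # fit() received one more positional argument than it accepts.
    raised = True

fitted = ToyPipeline(FitWithY()).fit(X)
print("fit(self, X) raised TypeError:", raised)
print("fit(self, X, y=None) saw", fitted.final_step.n_samples_, "samples")
```

Accepting and ignoring y keeps an unsupervised estimator callable from code that always passes both arguments.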
In this particular case the code *seems* reasonable (and I am surprised it doesn't work!), but I don't know much about the KDE stuff. Maybe a bug?

On Wed, Nov 5, 2014 at 7:44 AM, Michael Eickenberg <michael.eickenb...@gmail.com> wrote:

> Hi José,
>
> yes, there seems to be an inconsistency, KernelDensity.fit has signature
> (self, X) and not (self, X, y=None) as is usually the case even if y is
> never used, see
> https://github.com/scikit-learn/scikit-learn/blob/master/sklearn/neighbors/kde.py#L113
>
> I think the generally accepted way of remedying this is to just add y=None
> in the signature of that function, as was done e.g. for PCA, see
> https://github.com/scikit-learn/scikit-learn/blob/master/sklearn/decomposition/pca.py#L206
>
> But maybe I am missing something crucial. Happy to make the PR if I am
> right about this.
>
> Michael
>
> On Wed, Nov 5, 2014 at 1:35 PM, José Guilherme Camargo de Souza
> <jose.camargo.so...@gmail.com> wrote:
>
>> Hi all,
>>
>> Is the KernelDensity estimator compatible with pipelines? When I try
>> to use it inside one
>>
>> pipe1 = make_pipeline(StandardScaler(with_mean=True, with_std=True),
>>                       KernelDensity(algorithm="auto",
>>                                     kernel="gaussian", metric="euclidean"))
>> params = dict(kerneldensity__bandwidth=np.logspace(-10, 1, 100))
>> search = GridSearchCV(pipe1, param_grid=params, verbose=1, n_jobs=8,
>>                       cv=5)
>> search.fit(feats1)
>> search.best_estimator_
>>
>> I get a TypeError as follows:
>>
>> /home/desouza/anaconda/lib/python2.7/site-packages/sklearn/pipeline.pyc
>> in fit(self=Pipeline(steps=[('standardscaler',
>> StandardScale...euclidean',
>> metric_params=None, rtol=0))]), X=array([[ 5.701 ,
>> 73.6443 , 61.7018 ...2.7188 ,
>> 0.18243243, 0.21621622]]), y=None, **fit_params={})
>>     125     def fit(self, X, y=None, **fit_params):
>>     126         """Fit all the transforms one after the other and transform the
>>     127         data, then fit the transformed data using the final estimator.
>>     128         """
>>     129         Xt, fit_params = self._pre_transform(X, y, **fit_params)
>> --> 130         self.steps[-1][-1].fit(Xt, y, **fit_params)
>>     131         return self
>>     132
>>     133     def fit_transform(self, X, y=None, **fit_params):
>>     134         """Fit all the transforms one after the other and transform the
>>
>> TypeError: fit() takes exactly 2 arguments (3 given)
>>
>> Is this an issue or it is supposed not to be compatible? A quick
>> search in the mailing list and on stackoverflow did not return any
>> entry about this.
>>
>> Thanks,
>> José
>>
>> On Tue, Oct 21, 2014 at 3:03 PM, Jacob Vanderplas
>> <jake...@cs.washington.edu> wrote:
>> > Hi Jose,
>> > The KDE implementation does work on multivariate data, and will in general
>> > work for multimodal data as well. There are two caveats to that:
>> >
>> > 1. In the sklearn implementation, the bandwidth must be the same across each
>> > dimension. If this poses a problem for your data, the data can be scaled
>> > before the fit (Using StandardScaler or something similar).
>> > 2. The results will depend strongly on the choice of bandwidth: it's
>> > important to cross-validate to determine the optimal bandwidth, as is done in
>> > http://scikit-learn.org/stable/auto_examples/neighbors/plot_digits_kde_sampling.html
>> >
>> > Good luck!
>> > Jake
>> >
>> > Jake VanderPlas
>> > Director of Research – Physical Sciences
>> > eScience Institute, University of Washington
>> > http://www.vanderplas.com
>> >
>> > On Tue, Oct 21, 2014 at 2:09 AM, José Guilherme Camargo de Souza
>> > <jose.camargo.so...@gmail.com> wrote:
>> >>
>> >> Hi all,
>> >>
>> >> I would like to ask if the density estimation implementation of scikit
>> >> works with multivariate multimodal data. In the digits example [1] it
>> >> is clear that it supports multivariate datasets and in the guide
>> >> description [2] a 1-D bimodal distribution is used.
>> >>
>> >> Is it possible to use the same implementation on multivariate
>> >> gaussian-shaped data with more than 2 modes? If so, are there any
>> >> shortcomings or useful tips when doing that?
>> >>
>> >> Thanks in advance,
>> >> José
>> >>
>> >> [1] http://scikit-learn.org/stable/auto_examples/neighbors/plot_digits_kde_sampling.html#example-neighbors-plot-digits-kde-sampling-py
>> >> [2] http://scikit-learn.org/stable/modules/density.html#kernel-density-estimation
>> >>
>> >> José Guilherme
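
For the bandwidth search discussed in the quoted thread, one way to sidestep the pipeline is to scale the data first and run the grid search on KernelDensity directly. The following is only a sketch under assumptions: the three-cluster data is synthetic (it stands in for feats1, which I don't have), the bandwidth grid is an arbitrary choice of mine, and the import path sklearn.model_selection is the one in current releases (at the time of this thread GridSearchCV lived in sklearn.grid_search). Note that scaling outside the search means the scaler sees all folds.

```python
import numpy as np
from sklearn.model_selection import GridSearchCV
from sklearn.neighbors import KernelDensity
from sklearn.preprocessing import StandardScaler

# Synthetic 2-D data with three Gaussian-shaped modes (illustrative only).
rng = np.random.RandomState(0)
X = np.vstack([
    rng.normal(-3.0, 1.0, size=(100, 2)),
    rng.normal(0.0, 1.0, size=(100, 2)),
    rng.normal(4.0, 1.0, size=(100, 2)),
])
# Shuffle rows so each unshuffled CV fold contains points from every mode.
X = rng.permutation(X)

# Scale once up front, since the bandwidth is shared across dimensions.
X_scaled = StandardScaler().fit_transform(X)

# KernelDensity.score returns the total log-likelihood of the held-out fold,
# which GridSearchCV uses as the selection criterion when no scorer is given.
params = {"bandwidth": np.logspace(-1, 1, 10)}
search = GridSearchCV(KernelDensity(kernel="gaussian"), params, cv=5)
search.fit(X_scaled)

best_bandwidth = search.best_params_["bandwidth"]
log_density = search.best_estimator_.score_samples(X_scaled)
print("selected bandwidth:", best_bandwidth)
```

score_samples on the refit estimator gives the log-density at each point, so np.exp(log_density) recovers the density estimate itself.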
------------------------------------------------------------------------------
_______________________________________________ Scikit-learn-general mailing list Scikit-learn-general@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/scikit-learn-general