Re: [Scikit-learn-general] Question about KernelDensity implementation

2014-11-05 Thread Jacob Vanderplas
Sorry about that oversight in the design! A common test to catch those sorts of inconsistencies would be useful. The biggest problem is that KernelDensity is not fundamentally a classifier, regressor, or transformer, but a density estimator. When I initially did the KDE pull request, I floated the …
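Such a common test could be as simple as a signature check over all estimators. A rough sketch of the idea, not the actual test suite code (and using the modern all_estimators import path; in 2014 it lived in sklearn.utils.testing):

    import inspect
    from sklearn.utils import all_estimators

    # Flag any estimator whose fit() cannot take a y argument; Pipeline and
    # GridSearchCV call fit(X, y), so a missing y breaks them.
    for name, Est in all_estimators():
        fit = getattr(Est, "fit", None)
        if fit is None:
            continue
        params = inspect.signature(fit).parameters
        accepts_kwargs = any(p.kind is inspect.Parameter.VAR_KEYWORD
                             for p in params.values())
        if "y" not in params and not accepts_kwargs:
            print("%s.fit does not accept y" % name)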

Re: [Scikit-learn-general] Question about KernelDensity implementation

2014-11-05 Thread Andy
Fix here: https://github.com/scikit-learn/scikit-learn/pull/3826

Re: [Scikit-learn-general] Question about KernelDensity implementation

2014-11-05 Thread Andy
Huh, I thought KernelDensity was a classifier, but apparently it is not. You should be able to grid-search over "score" once the signature is fixed to score(X, y=None). Can you try adding that? I am surprised there is no common test for that, but I guess this estimator is too special in it …
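Once that fix is in, bandwidth selection by cross-validated log-likelihood should look roughly like this (a sketch with made-up data; GridSearchCV lives in sklearn.grid_search on 0.15 and sklearn.model_selection in later releases):

    import numpy as np
    from sklearn.neighbors import KernelDensity
    from sklearn.model_selection import GridSearchCV

    X = np.random.randn(100, 2)

    # With scoring left unset, GridSearchCV falls back to the estimator's own
    # score method; KernelDensity.score(X, y=None) is the total log-likelihood
    # of X, so the search picks the bandwidth with the best held-out likelihood.
    grid = GridSearchCV(KernelDensity(kernel="gaussian"),
                        {"bandwidth": np.logspace(-1, 1, 10)})
    grid.fit(X)  # no y: only possible once fit accepts y=None
    print(grid.best_params_)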

Re: [Scikit-learn-general] Question about KernelDensity implementation

2014-11-05 Thread Michael Eickenberg
On Wed, Nov 5, 2014 at 1:52 PM, Kyle Kastner wrote:
> In addition to the y=None thing, KDE doesn't have a transform or predict method - and I don't think Pipeline supports score or score_samples.
> That may have been the crucial thing I have missed :)

Indeed KDE would have to be at the end of the pipeline …

Re: [Scikit-learn-general] Question about KernelDensity implementation

2014-11-05 Thread Kyle Kastner
In addition to the y=None thing, KDE doesn't have a transform or predict method - and I don't think Pipeline supports score or score_samples. Maybe someone can comment on this, but I don't think KDE is typically used in a pipeline. In this particular case the code *seems* reasonable (and I am surprised …
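For what it's worth, Pipeline.score does delegate to the last step's score, so with the y=None fix KDE can at least sit at the end of a pipeline; score_samples can always be reached via named_steps (newer scikit-learn releases forward it too). A rough sketch, with made-up step names and data:

    import numpy as np
    from sklearn.pipeline import Pipeline
    from sklearn.preprocessing import StandardScaler
    from sklearn.neighbors import KernelDensity

    X = np.random.randn(200, 3)

    # KDE has no transform/predict, so it can only be the final step.
    pipe = Pipeline([("scale", StandardScaler()),
                     ("kde", KernelDensity(bandwidth=0.5))])
    pipe.fit(X)
    print(pipe.score(X))  # delegates to KernelDensity.score (total log-likelihood)

    # Per-sample log-density via the step itself:
    Xt = pipe.named_steps["scale"].transform(X)
    log_dens = pipe.named_steps["kde"].score_samples(Xt)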

Re: [Scikit-learn-general] Question about KernelDensity implementation

2014-11-05 Thread Michael Eickenberg
Hi José,
Yes, there seems to be an inconsistency: KernelDensity.fit has the signature (self, X) rather than (self, X, y=None), which is the usual convention even when y is never used; see
https://github.com/scikit-learn/scikit-learn/blob/master/sklearn/neighbors/kde.py#L113
I think the generally accepted way of re…
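The convention being a trailing, ignored y, the fix is essentially a one-line signature change. An illustrative mock of the convention, not the actual patch:

    from sklearn.base import BaseEstimator

    class KernelDensityLike(BaseEstimator):
        """Mock showing the conventional signature, not the real estimator."""

        def fit(self, X, y=None):
            # y is accepted but ignored, purely so that meta-estimators
            # (Pipeline, GridSearchCV) can call fit(X, y) uniformly.
            return self

        def score(self, X, y=None):
            # Same convention for score, which GridSearchCV calls as score(X, y).
            return 0.0  # placeholder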

Re: [Scikit-learn-general] Question about KernelDensity implementation

2014-11-05 Thread José Guilherme Camargo de Souza
Hi all,
Is the KernelDensity estimator compatible with pipelines? When I try to use it inside one:

    pipe1 = make_pipeline(StandardScaler(with_mean=True, with_std=True),
                          KernelDensity(algorithm="auto", kernel="gaussian",
                                        metric="euclidean"))
    params = dict(kerneldens…
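The parameter dict above is cut off; presumably the setup was along these lines. A self-contained reconstruction in which the bandwidth grid is a guess, the kerneldensity__ prefix follows make_pipeline's lowercased-class-name convention, and GridSearchCV is imported from its modern location (sklearn.grid_search on 0.15):

    import numpy as np
    from sklearn.pipeline import make_pipeline
    from sklearn.preprocessing import StandardScaler
    from sklearn.neighbors import KernelDensity
    from sklearn.model_selection import GridSearchCV

    X = np.random.randn(100, 2)

    pipe1 = make_pipeline(StandardScaler(with_mean=True, with_std=True),
                          KernelDensity(algorithm="auto", kernel="gaussian",
                                        metric="euclidean"))
    params = dict(kerneldensity__bandwidth=np.logspace(-1, 1, 5))  # assumed grid
    grid = GridSearchCV(pipe1, params)
    grid.fit(X)  # fails on 0.15 because KernelDensity.fit did not accept y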

Re: [Scikit-learn-general] Question about KernelDensity implementation

2014-10-23 Thread José Guilherme Camargo de Souza
Hi Jacob,
Thanks a lot for your detailed answer!
José Guilherme

On Tue, Oct 21, 2014 at 3:03 PM, Jacob Vanderplas wrote:
> Hi Jose,
> The KDE implementation does work on multivariate data, and will in general work for multimodal data as well. There are two caveats to that:
> 1. In the sklearn implementation, the bandwidth must be the same across each dimension. …

Re: [Scikit-learn-general] Question about KernelDensity implementation

2014-10-21 Thread Jacob Vanderplas
Hi Jose,
The KDE implementation does work on multivariate data, and will in general work for multimodal data as well. There are two caveats to that:

1. In the sklearn implementation, the bandwidth must be the same across each dimension. If this poses a problem for your data, the data can be scaled …
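A sketch of the scaling idea in caveat 1, with illustrative numbers: dividing each dimension by its own standard deviation makes the single shared bandwidth act like a per-dimension bandwidth sigma_j = bandwidth * std_j in the original coordinates.

    import numpy as np
    from sklearn.neighbors import KernelDensity

    rng = np.random.RandomState(0)
    X = rng.randn(500, 2) * [1.0, 100.0]  # dimensions on very different scales

    # Rescale to unit variance so one shared bandwidth is sensible.
    std = X.std(axis=0)
    kde = KernelDensity(kernel="gaussian", bandwidth=0.5).fit(X / std)

    # Per-sample log-density of new points; the -sum(log(std)) term is the
    # change-of-variables correction back to the original coordinates.
    new = rng.randn(5, 2) * [1.0, 100.0]
    log_dens = kde.score_samples(new / std) - np.log(std).sum()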