[Scikit-learn-general] Best way to integrate probability density function from GMM? -- Fixed

2014-10-09 Thread Benjamin Blumer
Hi all, I see that gmm.score(x) returns the log probability of x for that point. I'm interested in integrating this probability over a region. For example, finding the probability of a ball being in the space (x,y,z) +/- (delta_x, delta_y, delta_z). In this example, I'd be using past ball locati

[Scikit-learn-general] Best way to integrate probability density function from GMM?

2014-10-09 Thread Benjamin Blumer
Hi all, I see that gmm.score(x) returns the log probability of x for that point. I'm interested in integrating this probability over a region. For example, finding the probability of a ball being in the space (x,y,z) +/- (delta_x, delta_y, delta_z). In this example, I'd be using past ball locat
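
One way to approach the question above, sketched under a strong assumption: if the mixture uses diagonal covariances, the probability mass inside an axis-aligned box has a closed form, because each component factorizes into one-dimensional normal CDF differences. The function names and the example weights, means, and standard deviations below are hypothetical, not taken from the thread or from scikit-learn. With full covariances this shortcut does not apply; drawing samples from the fitted model and counting the fraction that land in the box is one alternative.

```python
import math


def normal_cdf(x, mean, std):
    """CDF of a one-dimensional normal distribution."""
    return 0.5 * (1.0 + math.erf((x - mean) / (std * math.sqrt(2.0))))


def box_probability(weights, means, stds, lower, upper):
    """Probability mass of a diagonal-covariance Gaussian mixture inside
    the axis-aligned box [lower, upper].

    Assumes every component has a diagonal covariance, so its density
    factorizes over the dimensions (hypothetical helper, not a library API).
    """
    total = 0.0
    for w, mu, sd in zip(weights, means, stds):
        p = w
        for m, s, lo, hi in zip(mu, sd, lower, upper):
            p *= normal_cdf(hi, m, s) - normal_cdf(lo, m, s)
        total += p
    return total


# Made-up two-component mixture in 3-D (illustrative values only).
weights = [0.6, 0.4]
means = [(0.0, 0.0, 0.0), (2.0, 1.0, 0.0)]
stds = [(1.0, 1.0, 1.0), (0.5, 0.5, 0.5)]

# Box centred on (x, y, z) with half-widths (delta_x, delta_y, delta_z).
center = (0.0, 0.0, 0.0)
delta = (1.0, 1.0, 1.0)
lower = [c - d for c, d in zip(center, delta)]
upper = [c + d for c, d in zip(center, delta)]

prob = box_probability(weights, means, stds, lower, upper)
```

Note that exponentiating the log density at a single point gives a density value, not a probability; only the integral over a region, as computed here, is a probability.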

Re: [Scikit-learn-general] delta idf and bm25

2014-10-09 Thread Lars Buitinck
2014-08-23 21:25 GMT+02:00 Lars Buitinck:
> I was just implementing tf-chi2 today (I have a text classification
> task to improve anyway), so I might send a PR somewhere over the next
> week to at least establish the API. Supervised term weighting is
> pretty big, with hundreds of citations for th

Re: [Scikit-learn-general] Using TFxIDF with HashingVectorizer

2014-10-09 Thread Lars Buitinck
2014-09-09 3:36 GMT+02:00 Apu Mishra:
> Lars Buitinck writes:
>
>> The way to combine HV and
>> Tfidf is
>>
>> hashing = HashingVectorizer(non_negative=True, norm=None)
>> tfidf = TfidfTransformer()
>> hashing_tfidf = Pipeline([("hashing", hashing), ("tidf", tfidf)])
>>
>
> I notice your use of t
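
For readers on a recent scikit-learn release, the quoted pipeline can be sketched as follows. This assumes a version in which the `non_negative` argument is no longer accepted and `alternate_sign=False` plays the equivalent role of keeping hashed counts non-negative. The toy documents and the reduced `n_features` value are illustrative choices, not part of the thread.

```python
from sklearn.feature_extraction.text import HashingVectorizer, TfidfTransformer
from sklearn.pipeline import Pipeline

# Raw hashed term counts: no sign flipping and no normalization, so the
# TfidfTransformer downstream sees plain non-negative counts.
hashing = HashingVectorizer(alternate_sign=False, norm=None, n_features=2 ** 10)
tfidf = TfidfTransformer()
hashing_tfidf = Pipeline([("hashing", hashing), ("tfidf", tfidf)])

# Made-up toy corpus (illustrative only).
docs = [
    "the ball is in the box",
    "the ball left the box",
    "feature hashing keeps memory bounded",
]
X = hashing_tfidf.fit_transform(docs)
```

The hashing step is stateless, so only the IDF weights are learned during `fit_transform`.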

Re: [Scikit-learn-general] Feature selection: floating search algorithm

2014-10-09 Thread Nikolay Mayorov
Hello, Pietro. Thank you for your interest in the subject. The algorithm itself is rather straightforward. The challenge is fitting it into the scikit-learn framework. In the original paper they evaluated feature subsets using the Mahalanobis distance between classes, but it can be any other cr
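
As a rough illustration of the floating search idea mentioned above, here is a minimal sketch of sequential floating forward selection with a pluggable criterion. It is not the implementation discussed in the thread: the function name `sffs`, its signature, and the toy additive criterion are all hypothetical. The criterion is any callable that scores a list of feature indices, with higher meaning better.

```python
def sffs(n_features, k, criterion):
    """Sequential floating forward selection (illustrative sketch).

    criterion(subset) takes a list of feature indices and returns a score
    to maximize. Returns the best subset of size k found, and its score.
    """
    selected = []
    best = {}  # subset size -> (score, subset)
    while len(selected) < k:
        # Forward step: add the feature that improves the criterion most.
        candidates = [f for f in range(n_features) if f not in selected]
        f_best = max(candidates, key=lambda f: criterion(selected + [f]))
        selected = selected + [f_best]
        score = criterion(selected)
        size = len(selected)
        if size not in best or score > best[size][0]:
            best[size] = (score, list(selected))

        # Floating step: drop features while doing so beats the best
        # subset previously recorded for the smaller size.
        while len(selected) > 2:
            f_worst = max(
                selected,
                key=lambda f: criterion([g for g in selected if g != f]),
            )
            reduced = [g for g in selected if g != f_worst]
            r_score = criterion(reduced)
            if r_score > best[len(reduced)][0]:
                selected = reduced
                best[len(reduced)] = (r_score, list(reduced))
            else:
                break
    score, subset = best[k]
    return subset, score


# Toy criterion: each feature has a made-up, independent weight.
toy_weights = [0.1, 0.9, 0.5, 0.7]
subset, score = sffs(
    n_features=4, k=2, criterion=lambda s: sum(toy_weights[f] for f in s)
)
```

With an additive criterion like this one the floating step never fires; it matters when features interact, which is where a class-separability measure or a cross-validated score would be substituted.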