Re: [Scikit-learn-general] Possible code contribution (Poisson loss)

2015-07-30 Thread Jan Hendrik Metzen
That's true, I wasn't aware that score_samples is already used in the context of density estimation. score_samples would be okay then, in my opinion. Jan

Re: [Scikit-learn-general] Possible code contribution (Poisson loss)

2015-07-30 Thread Mathieu Blondel
While the Gaussian distribution has a PDF, the Poisson distribution has a PMF. From Wikipedia (https://en.wikipedia.org/wiki/Probability_mass_function): A probability mass function differs from a probability density function (pdf) in that the latter is associated with continuous rather than
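For concreteness, a minimal sketch of that distinction using scipy.stats (not from the original message; the distributions and values are purely illustrative):

    # Sketch of PDF vs. PMF with scipy.stats; purely illustrative.
    from scipy.stats import norm, poisson

    # Continuous case: a density, which can be evaluated at any real y.
    print(norm.pdf(2.3, loc=2.0, scale=1.0))      # a density value, not a probability

    # Discrete case: a probability mass, defined only at integer counts.
    print(poisson.pmf(3, mu=2.0))                 # P(y = 3) for rate 2.0
    print(poisson.pmf(range(20), mu=2.0).sum())   # sums to ~1 over the support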

Re: [Scikit-learn-general] Possible code contribution (Poisson loss)

2015-07-30 Thread Mathieu Blondel
On Thu, Jul 30, 2015 at 11:38 PM, Andreas Mueller t3k...@gmail.com wrote: I am mostly concerned about API explosion. I take your point about PDF vs. PMF. Maybe predict_proba(X, y) is better. Would you also support predict_proba(X, y) for classifiers (which would be

Re: [Scikit-learn-general] Possible code contribution (Poisson loss)

2015-07-30 Thread Brian Scannell
I support the inclusion of Poisson loss, although a quick note on predict_proba_at: The output of Poisson regression is a posterior distribution over the rate parameter in the form of a Gamma distribution. If we assume no uncertainty at all in the prediction, the posterior predictive distribution
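To make that concrete, a small sketch under an assumed parameterization (rate ~ Gamma(shape=a, scale=s), which is not taken from the thread): the Gamma-Poisson predictive is negative binomial, and it collapses to a plain Poisson as the posterior uncertainty shrinks.

    # Sketch of the Gamma-Poisson posterior predictive; the parameterization
    # rate ~ Gamma(shape=a, scale=s) is an assumption for illustration.
    from scipy.stats import nbinom, poisson

    a, s = 50.0, 0.04                        # Gamma posterior with mean a * s = 2.0
    predictive = nbinom(n=a, p=1.0 / (1.0 + s))
    print(predictive.pmf(3))                 # posterior predictive P(y = 3)

    # With no posterior uncertainty, this reduces to a Poisson pmf at the mean rate.
    print(poisson.pmf(3, mu=a * s))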

Re: [Scikit-learn-general] Possible code contribution (Poisson loss)

2015-07-29 Thread Andreas Mueller
Hm, I'm not entirely sure how score_samples is currently used, but I think it is the probability under a density model. It would only change the meaning insofar as it is a conditional distribution over y given x, rather than over x. I'm not totally opposed to adding a new method, though I'm not sure I

Re: [Scikit-learn-general] Possible code contribution (Poisson loss)

2015-07-29 Thread Jan Hendrik Metzen
I am not sure about the name; score_samples would sound a bit strange for a conditional probability, in my opinion. And likelihood is also misleading, since it's actually a conditional probability and not a conditional likelihood (the quantities on the right-hand side of the conditioning are fixed

Re: [Scikit-learn-general] Possible code contribution (Poisson loss)

2015-07-29 Thread Jan Hendrik Metzen
Such a predict_proba_at() method would also make sense for Gaussian process regression. Currently, computing probability densities for GPs requires predicting mean and standard deviation (via MSE) at X and using scipy.stats.norm.pdf to compute probability densities for y for the predicted mean
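Roughly what is described there, as a minimal sketch against the GaussianProcess estimator available at the time (this assumes predict(..., eval_MSE=True) returns the predictive variance alongside the mean; data, hyperparameters, and the query value are illustrative):

    # Minimal sketch: p(y | x) for a GP via the predicted mean and variance.
    # Assumes the pre-0.18 sklearn.gaussian_process.GaussianProcess API.
    import numpy as np
    from scipy.stats import norm
    from sklearn.gaussian_process import GaussianProcess

    X_train = np.linspace(0, 10, 20)[:, np.newaxis]
    y_train = np.sin(X_train).ravel()

    # A small nugget is added only for numerical stability in this toy example.
    gp = GaussianProcess(theta0=1.0, nugget=1e-6).fit(X_train, y_train)

    X_query = np.linspace(0.25, 9.75, 25)[:, np.newaxis]
    y_mean, y_mse = gp.predict(X_query, eval_MSE=True)

    # Density of a candidate target value at each query point, i.e. p(y_query | x).
    y_query = 0.5
    densities = norm.pdf(y_query, loc=y_mean, scale=np.sqrt(y_mse))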

Re: [Scikit-learn-general] Possible code contribution (Poisson loss)

2015-07-29 Thread Andreas Mueller
Shouldn't that be score_samples? Well, it is a conditional likelihood p(y|x), not p(x) or p(x, y). But it is the likelihood of some data given the model.

Re: [Scikit-learn-general] Possible code contribution (Poisson loss)

2015-07-28 Thread Mathieu Blondel
Regarding predictions, I don't really see what the problem is. Using GLMs as an example, you just need to do:

    def predict(self, X):
        if self.loss == "poisson":
            return np.exp(np.dot(X, self.coef_))
        else:
            return np.dot(X, self.coef_)

A nice thing about Poisson regression is that

Re: [Scikit-learn-general] Possible code contribution (Poisson loss)

2015-07-28 Thread josef.pktd
Just a comment from the statistics sidelines: taking the log of the target and fitting a linear or other model doesn't make it a Poisson model. But maybe Poisson loss in machine learning is unrelated to the Poisson distribution, or to a Poisson model with E(y|x) = exp(x beta)? Josef
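To illustrate the distinction being drawn, a short sketch using numpy/scipy only (data and variable names are illustrative, and this is not proposed sklearn code): least squares on log(y) and maximizing the Poisson likelihood with E(y|x) = exp(x beta) are different estimators and generally give different coefficients.

    # Sketch contrasting "regress on log(y)" with an actual Poisson model
    # E(y|x) = exp(x beta); data and variable names are illustrative only.
    import numpy as np
    from scipy.optimize import minimize

    rng = np.random.RandomState(0)
    X = np.hstack([np.ones((200, 1)), rng.uniform(-1, 1, (200, 1))])
    y = rng.poisson(np.exp(np.dot(X, [0.5, 1.0])))

    # (a) Log-transform trick: least squares on log(y + 1) -- not a Poisson model.
    beta_log = np.linalg.lstsq(X, np.log(y + 1.0))[0]

    # (b) Poisson regression: minimize the negative log-likelihood,
    #     which is sum(exp(X beta) - y * (X beta)) up to a constant in beta.
    def poisson_nll(beta):
        eta = np.dot(X, beta)
        return np.sum(np.exp(eta) - y * eta)

    beta_poisson = minimize(poisson_nll, np.zeros(X.shape[1])).x
    print(beta_log, beta_poisson)   # the two estimates differ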

Re: [Scikit-learn-general] Possible code contribution (Poisson loss)

2015-07-28 Thread Andreas Mueller
I'd be happy with adding Poisson loss to more models, though I think it would be more natural to add it to GLMs before GBMs ;) If the addition is straightforward, I think it would be a nice contribution nevertheless. 1) For the user to do np.exp(gbmpoisson.predict(X)) is not acceptable.

Re: [Scikit-learn-general] Possible code contribution (Poisson loss)

2015-07-28 Thread Andreas Mueller
I was expecting there to be the actual Poisson loss implemented in the class, not just a log transform.

[Scikit-learn-general] Possible code contribution (Poisson loss)

2015-07-23 Thread Peter Rickwood
Hello sklearn developers, I'd like the GBM implementation in sklearn to support Poisson loss, and I'm comfortable writing the code (I have already modified my local sklearn source and am using Poisson-loss GBMs). The sklearn site says to get in touch via this list before making a
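For context on what such a loss involves (a hypothetical sketch of the math only, not sklearn's internal loss API): with raw predictions kept on the log scale so that E(y|x) = exp(pred), the Poisson negative log-likelihood and the negative gradient that each boosting stage would fit are both simple closed forms.

    # Hypothetical sketch of a Poisson loss for gradient boosting, with raw
    # predictions `pred` on the log scale so that E(y|x) = exp(pred).
    # This is NOT sklearn's internal loss API, just the underlying math.
    import numpy as np

    def poisson_loss(y, pred):
        # Negative log-likelihood up to terms constant in pred.
        return np.mean(np.exp(pred) - y * pred)

    def poisson_negative_gradient(y, pred):
        # Pseudo-residuals each boosting stage would fit: y - exp(pred).
        return y - np.exp(pred)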