Hm, I'm not entirely sure how score_samples is currently used, but I think it is the probability under a density model. It would "only" change the meaning insofar as it would be a conditional distribution over y given x rather than a distribution over x.
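To make the conditional reading concrete for the Poisson case discussed in the quoted thread, here is a minimal sketch. It assumes a log link, E[y | x] = exp(x . coef), and the name ``poisson_proba_at`` with its signature is hypothetical, not an existing scikit-learn method. It uses only the standard library for clarity; a real implementation would presumably be vectorized with NumPy.

```python
import math


def poisson_proba_at(X, coef, y):
    """Return p(y | x) for each row x of X under a Poisson model.

    Assumes the log link E[y | x] = exp(x . coef). The function name and
    signature are hypothetical, not an existing scikit-learn API.
    """
    probs = []
    for x in X:
        # Predicted mean (rate) for this sample.
        mu = math.exp(sum(xi * ci for xi, ci in zip(x, coef)))
        # Poisson pmf evaluated in log space for numerical stability.
        log_p = y * math.log(mu) - mu - math.lgamma(y + 1)
        probs.append(math.exp(log_p))
    return probs


# Two illustrative samples and made-up coefficients.
X = [[1.0, 0.5], [1.0, -0.2]]
coef = [0.3, 0.8]

# p(y = 2 | x) for each sample.
p_at_2 = poisson_proba_at(X, coef, y=2)

# For a fixed x, the conditional probabilities over all y sum to 1
# (truncated here at y = 99, where the remaining mass is negligible).
totals = [
    sum(column)
    for column in zip(*(poisson_proba_at(X, coef, y) for y in range(100)))
]
```

The output has shape n_samples for a single y, unlike the n_samples x n_classes output of predict_proba for classifiers, which is the API tension raised further down in the thread.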
I'm not totally opposed to adding a new method, though I'm not sure I like
``predict_proba_at``.

On 07/29/2015 12:29 PM, Jan Hendrik Metzen wrote:
> I am not sure about the name; score_samples would sound a bit strange
> for a conditional probability in my opinion. And likelihood is also
> misleading since it's actually a conditional probability and not a
> conditional likelihood (the quantities on the right-hand side of
> conditioning are fixed and integrating over all y would be 1).
>
> On 29.07.2015 16:16, Andreas Mueller wrote:
>> Shouldn't that be "score_samples"?
>> Well, it is a conditional likelihood p(y|x), not p(x) or p(x, y).
>> But it is the likelihood of some data given the model.
>>
>>
>> On 07/29/2015 02:58 AM, Jan Hendrik Metzen wrote:
>>> Such a predict_proba_at() method would also make sense for Gaussian
>>> process regression. Currently, computing probability densities for GPs
>>> requires predicting mean and standard deviation (via "MSE") at X and
>>> using scipy.stats.norm.pdf to compute probability densities for y for
>>> the predicted mean and standard deviation. I think it would be nice to
>>> allow this directly via the API. Thus +1 for adding a method like
>>> predict_proba_at().
>>>
>>> Jan
>>>
>>> On 29.07.2015 06:42, Mathieu Blondel wrote:
>>>> Regarding predictions, I don't really see what the problem is. Using
>>>> GLMs as an example, you just need to do
>>>>
>>>> def predict(self, X):
>>>>     if self.loss == "poisson":
>>>>         return np.exp(np.dot(X, self.coef_))
>>>>     else:
>>>>         return np.dot(X, self.coef_)
>>>>
>>>> A nice thing about Poisson regression is that we can query the
>>>> probability p(y|x) for a specific integer y.
>>>> https://en.wikipedia.org/wiki/Poisson_regression
>>>>
>>>> We need to decide on an API for that (so far we have used predict_proba
>>>> for classification so the output was always n_samples x n_classes).
>>>> How about predict_proba(X, at_y=some_integer)?
>>>>
>>>> However, this also means that we can't use predict_proba to detect
>>>> classifiers anymore...
>>>> Another solution would be to introduce a new method
>>>> predict_proba_at(X, y=some_integer)...
>>>>
>>>> Mathieu
>>>>
>>>>
>>>> On Wed, Jul 29, 2015 at 4:19 AM, Andreas Mueller <t3k...@gmail.com>
>>>> wrote:
>>>>
>>>> I was expecting there to be the actual Poisson loss implemented in
>>>> the class, not just a log transform.
>>>>
>>>>
>>>>
>>>> On 07/28/2015 02:03 PM, josef.p...@gmail.com wrote:
>>>>> Just a comment from the statistics sidelines:
>>>>>
>>>>> Taking the log of the target and fitting a linear or other model
>>>>> doesn't make it into a Poisson model.
>>>>>
>>>>> But maybe "Poisson loss" in machine learning is unrelated to the
>>>>> Poisson distribution or a Poisson model with E(y|x) = exp(x
>>>>> beta)?
>>>>>
>>>>> Josef
>>>>>
>>>>>
>>>>> On Tue, Jul 28, 2015 at 2:46 PM, Andreas Mueller
>>>>> <t3k...@gmail.com> wrote:
>>>>>
>>>>> I'd be happy with adding Poisson loss to more models, though
>>>>> I think it would be more natural to first add it to GLM
>>>>> before GBM ;)
>>>>> If the addition is straightforward, I think it would be a
>>>>> nice contribution nevertheless.
>>>>> 1) For the user to do np.exp(gbmpoisson.predict(X)) is not
>>>>> acceptable. This needs to be automatic. It would be best if
>>>>> this could be done in a minimally intrusive way.
>>>>>
>>>>> 2) I'm not sure, maybe Peter can comment?
>>>>>
>>>>> 3) I would rather contribute sooner, but others might think
>>>>> differently. Silently ignoring sample weights is not an
>>>>> option, but you can error if they are provided.
>>>>>
>>>>> Hth,
>>>>> Andy
>>>>>
>>>>>
>>>>> On 07/23/2015 08:52 PM, Peter Rickwood wrote:
>>>>>> Hello sklearn developers,
>>>>>>
>>>>>> I'd like the GBM implementation in sklearn to support
>>>>>> Poisson loss, and I'm comfortable writing the code (I
>>>>>> have modified my local sklearn source already and am using
>>>>>> Poisson loss GBMs).
>>>>>>
>>>>>> The sklearn site says to get in touch via this list before
>>>>>> making a contribution, so is it worth me submitting
>>>>>> something along these lines?
>>>>>>
>>>>>> If the answer is yes, some quick questions:
>>>>>>
>>>>>> 1) The simplest implementation of Poisson loss GBMs is to
>>>>>> work in log-space (i.e. the GBM predicts log(target) rather
>>>>>> than target), and require the user to then take the
>>>>>> exponential of those predictions. So, you would need to do
>>>>>> something like:
>>>>>>     gbmpoisson = sklearn.ensemble.GradientBoostingRegressor(...)
>>>>>>     gbmpoisson.fit(X, y)
>>>>>>     preds = np.exp(gbmpoisson.predict(X))
>>>>>> I am comfortable making changes to the source for this to
>>>>>> work, but I'm not comfortable changing any of the
>>>>>> higher-level interface to deal automatically with the
>>>>>> transform. In other words, other developers would need to
>>>>>> either be OK with the GBM returning transformed predictions
>>>>>> in the case where "poisson" loss is chosen, or would need to
>>>>>> change code in the 'predict' function to automatically do
>>>>>> the transformation if Poisson loss was specified. Is this OK?
>>>>>> 2) If I do contribute, can you advise what the best tests
>>>>>> are to test/validate GBM loss functions before they are
>>>>>> considered to 'work'?
>>>>>>
>>>>>> 3) Allowing for weighted samples is in theory easy enough to
>>>>>> implement, but is not something I have implemented yet. Is
>>>>>> it better to contribute code sooner that doesn't handle
>>>>>> weighting (i.e. just ignores sample weights), or later that
>>>>>> does?
>>>>>>
>>>>>> Cheers, and thanks for all your work on sklearn. Fantastic
>>>>>> tool/library,
>>>>>>
>>>>>> Peter
>>>>>>
>>>>>> ------------------------------------------------------------------------------
>>>>>> _______________________________________________
>>>>>> Scikit-learn-general mailing list
>>>>>> Scikit-learn-general@lists.sourceforge.net
>>>>>> https://lists.sourceforge.net/lists/listinfo/scikit-learn-general