Hello sklearn developers,

I'd like the GBM implementation in sklearn to support Poisson loss, and I'm
comfortable writing the code (I have already modified my local sklearn
source and am using Poisson-loss GBMs).

The sklearn site says to get in touch via this list before making a
contribution, so is it worth my submitting something along these lines?

If the answer is yes, some quick questions:

1) The simplest implementation of Poisson-loss GBMs is to work in log-space
(i.e. the GBM predicts log(target) rather than target), and require the
user to then take the exponential of those predictions. So, you would need
to do something like:
          gbmpoisson = sklearn.ensemble.GradientBoostingRegressor(...)
          gbmpoisson.fit(X, y)
          preds = np.exp(gbmpoisson.predict(X))
I am comfortable making the source changes needed for this to work, but I'm
not comfortable changing any of the higher-level interface to handle the
transform automatically. In other words, other developers would need either
to be OK with the GBM returning log-space predictions when "poisson" loss
is chosen, or to change the 'predict' function so that it applies the
transformation automatically when Poisson loss is specified. Is this OK?
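
To make the log-space formulation concrete, here is a minimal sketch of the
loss and the pseudo-residuals each boosting stage would fit. It is written
with plain NumPy and is not the sklearn loss-function interface; the
function names are hypothetical, and raw_pred is assumed to be the model's
log-space output, so the predicted mean is exp(raw_pred).

```python
import numpy as np


def poisson_loss(y, raw_pred):
    """Mean Poisson negative log-likelihood, dropping the log(y!) constant.

    raw_pred is assumed to be the log of the predicted mean.
    """
    return np.mean(np.exp(raw_pred) - y * raw_pred)


def poisson_negative_gradient(y, raw_pred):
    """Pseudo-residuals: minus the derivative of the loss w.r.t. raw_pred."""
    return y - np.exp(raw_pred)


# Toy data: the second and third predictions already match the targets.
y = np.array([0.0, 1.0, 3.0])
raw_pred = np.log(np.array([0.5, 1.0, 3.0]))

residuals = poisson_negative_gradient(y, raw_pred)
preds = np.exp(raw_pred)  # back-transform to the target scale
```

The residual is zero exactly where exp(raw_pred) equals the target, which
is why the user (or 'predict') has to exponentiate to get back to counts.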

2) If I do contribute, can you advise what the best tests are to
test/validate GBM loss functions before they are considered to 'work'?
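
One candidate check, offered as a sketch rather than as the project's
actual test requirements, is to compare the analytic negative gradient
against a finite-difference estimate of the loss. The helper names below
are hypothetical and the code uses only NumPy.

```python
import numpy as np


def poisson_loss_sum(y, raw_pred):
    """Summed Poisson negative log-likelihood (constant term dropped)."""
    return np.sum(np.exp(raw_pred) - y * raw_pred)


def poisson_negative_gradient(y, raw_pred):
    """Analytic negative gradient with respect to raw_pred."""
    return y - np.exp(raw_pred)


def numerical_negative_gradient(y, raw_pred, eps=1e-6):
    """Central-difference estimate of the negative gradient."""
    grad = np.empty_like(raw_pred)
    for i in range(raw_pred.size):
        step = np.zeros_like(raw_pred)
        step[i] = eps
        up = poisson_loss_sum(y, raw_pred + step)
        down = poisson_loss_sum(y, raw_pred - step)
        grad[i] = -(up - down) / (2 * eps)
    return grad


rng = np.random.RandomState(0)
y = rng.poisson(lam=3.0, size=20).astype(float)
raw_pred = rng.normal(size=20)

analytic = poisson_negative_gradient(y, raw_pred)
numeric = numerical_negative_gradient(y, raw_pred)
gradient_ok = np.allclose(analytic, numeric, atol=1e-4)
```

A second sanity check in the same spirit would be that the best constant
log-space prediction is log(mean(y)).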

3) Allowing for weighted samples is in theory easy enough to implement, but
is not something I have implemented yet. Is it better to contribute code
sooner that doesn't handle weighting (i.e. just ignores sample weights), or
later that does?
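
For reference, a minimal sketch of what weighting could look like for this
loss, assuming weights simply scale each sample's contribution. The names
are hypothetical and this does not describe how sklearn handles
sample weights internally.

```python
import numpy as np


def weighted_poisson_loss(y, raw_pred, sample_weight):
    """Weighted mean Poisson negative log-likelihood (constant dropped)."""
    per_sample = np.exp(raw_pred) - y * raw_pred
    return np.sum(sample_weight * per_sample) / np.sum(sample_weight)


def weighted_initial_raw_prediction(y, sample_weight):
    """Best constant log-space prediction: log of the weighted mean of y."""
    return np.log(np.average(y, weights=sample_weight))


y = np.array([0.0, 2.0, 5.0])
raw_pred = np.array([0.1, 0.5, 1.2])
weights = np.array([1.0, 2.0, 3.0])

# Integer weights should behave like repeating each sample that many times.
counts = [1, 2, 3]
y_rep = np.repeat(y, counts)
raw_rep = np.repeat(raw_pred, counts)

loss_weighted = weighted_poisson_loss(y, raw_pred, weights)
loss_repeated = weighted_poisson_loss(y_rep, raw_rep, np.ones_like(y_rep))
init = weighted_initial_raw_prediction(y, weights)
```

Checking that integer weights match repeated samples is one simple way to
validate a weighted implementation once it exists.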




Cheers, and thanks for all your work on sklearn. Fantastic tool/library,



Peter