GitHub user actuaryzhang commented on the issue:
https://github.com/apache/spark/pull/16344
@srowen Thanks for the comments. Really helpful. I have made a new commit
that addresses the issues you raised:
- I don't think a global family object works well for the tweedie case,
because the variance power has to be set for each model. I now define a
tweedie class and create the tweedie object within the `train` method of
`GeneralizedLinearRegression`, where the variance power is set. Does this
make sense?
- I created a constant `delta = 0.1` in the `Family` class, which is used to
shift `mu` (or `y`) away from 0 to avoid numerical issues.
- I renamed `variancePower` to `varPower` to save typing (and to be
consistent with R and H2O).
- The `project` method you asked about is copied from the other families.
I assume its purpose is to bound the mean and so stabilize estimation.
- I now throw `UnsupportedOperationException` for tweedie AIC.
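
To make the points above concrete, here is a minimal, self-contained sketch of the design. It is not the code in the commit: the `TweedieSketch` wrapper, the `familyFor` helper, and the method signatures are assumptions made for illustration. Only `delta`, `varPower`, the `Family` name, and the `UnsupportedOperationException` for AIC come from the list above. The variance function `mu^varPower` is the standard tweedie variance function.

```scala
// Illustrative sketch only; structure and signatures are assumptions,
// not the actual code in the pull request.
object TweedieSketch {

  // Base class for families. `delta` is the shared constant used to move
  // zero-valued means or responses away from 0.
  abstract class Family(val name: String) {
    val delta: Double = 0.1

    /** Variance function of the family, evaluated at the mean `mu`. */
    def variance(mu: Double): Double

    /** Starting value for the mean, given a response `y`. */
    def initialize(y: Double): Double

    /** AIC; a family may not support it. */
    def aic(deviance: Double, numInstances: Double): Double
  }

  // A class rather than a singleton object, because the variance power is a
  // per-model parameter and cannot be fixed globally.
  class Tweedie(val varPower: Double) extends Family("tweedie") {

    // Standard tweedie variance function: V(mu) = mu^varPower.
    override def variance(mu: Double): Double = math.pow(mu, varPower)

    // Shift an exact zero to `delta` to avoid numerical issues
    // (assumed handling; the commit may treat this differently).
    override def initialize(y: Double): Double =
      if (y == 0.0) delta else y

    // AIC is not provided for tweedie in this sketch.
    override def aic(deviance: Double, numInstances: Double): Double =
      throw new UnsupportedOperationException(
        "No AIC available for the tweedie family")
  }

  // Hypothetical stand-in for the step inside `train` where the family is
  // instantiated once the variance power is known.
  def familyFor(varPower: Double): Family = new Tweedie(varPower)
}
```

With this shape, a caller would write `TweedieSketch.familyFor(1.5)` at training time, so each model gets its own family instance carrying its own variance power.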
Let me know if there are any other issues.
Copying others for review. @yanboliang @sethah